This page covers day-two operations for the central platform and connected workload clusters.

Upgrade flow

For production-style environments:
  1. Review release notes and changed configuration keys.
  2. Back up the control-plane and LLM-gateway Postgres databases.
  3. Confirm Redis availability and capacity.
  4. Apply migrations before starting updated services.
  5. Roll the management console, control plane, execution engine, and LLM gateway.
  6. Verify OIDC sign-in, workspace listing, agent connectivity, and a read-only run.
  7. Roll k8s agents separately when chart or agent behavior changes.
Pin image tags during upgrades so rollback targets are explicit.
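
As a rough illustration of the pinning advice, the sketch below flags image references that have no explicit tag or that use a floating tag. The component names, the registry host, and the list of floating tags are assumptions made for this example, not values from the platform charts.

```python
import re
from typing import Dict, List

# A full sha256 digest suffix, e.g. "app@sha256:<64 hex chars>".
DIGEST_RE = re.compile(r"@sha256:[0-9a-f]{64}$")
# Assumed set of tags that move over time; adjust to your registry's habits.
FLOATING_TAGS = {"latest", "stable", "main", "edge"}


def is_pinned(image_ref: str) -> bool:
    """Return True when the reference names an explicit tag or digest."""
    if "@" in image_ref:
        return bool(DIGEST_RE.search(image_ref))
    # Only look for a tag after the last "/" so a registry port such as
    # "registry.example:5000/app" is not mistaken for a tag.
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag means the runtime falls back to "latest"
    tag = last_segment.split(":", 1)[1]
    return bool(tag) and tag not in FLOATING_TAGS


def unpinned_images(images: Dict[str, str]) -> List[str]:
    """List the components whose image reference is not pinned."""
    return sorted(name for name, ref in images.items() if not is_pinned(ref))
```

Running `unpinned_images` over the planned image set before a rollout gives a short list of components to fix, and the same pinned references then serve as the rollback targets.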

Backups

Back up durable state:
| Store | Contains |
| --- | --- |
| Control-plane Postgres | Workspaces, users, members, clusters, sessions, runs, invitations, webhooks |
| LLM-gateway Postgres | MCP registry, gateway metadata, encrypted provider or MCP secrets when using database-backed secrets |
| External secret manager | Provider keys and MCP auth secrets when Vault or another manager is enabled |
Redis is used for coordination, rate limits, and transient run/event state. Size it for expected run concurrency and event volume.
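
One possible way to script the two Postgres backups is to generate `pg_dump` invocations, as sketched below. The database names `control_plane` and `llm_gateway` and the output directory are hypothetical; connection settings are assumed to come from the usual libpq environment variables such as `PGHOST` and `PGUSER`.

```python
from datetime import datetime, timezone
from typing import List, Optional

# Hypothetical database names; substitute the names your deployment uses.
DATABASES = ("control_plane", "llm_gateway")


def pg_dump_command(dbname: str, out_dir: str, stamp: str) -> List[str]:
    """Build a pg_dump invocation that writes a custom-format archive."""
    return [
        "pg_dump",
        "--format=custom",  # archive format that pg_restore can read
        f"--file={out_dir}/{dbname}-{stamp}.dump",
        f"--dbname={dbname}",
    ]


def backup_plan(out_dir: str, now: Optional[datetime] = None) -> List[List[str]]:
    """Return one pg_dump command per database, sharing a UTC timestamp."""
    moment = now or datetime.now(timezone.utc)
    stamp = moment.strftime("%Y%m%dT%H%M%SZ")
    return [pg_dump_command(db, out_dir, stamp) for db in DATABASES]
```

Each command list can be passed to `subprocess.run(cmd, check=True)`. Sharing one timestamp across both dumps makes it easier to pair the archives when restoring.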

Health checks

Verify the platform from the outside in:
  • management console loads over TLS,
  • GET /api/v1/me works after sign-in,
  • OIDC callback returns to the management console,
  • a workspace can be created and listed,
  • a k8s agent can connect over WebSocket,
  • a run streams events through GET /api/v1/runs/{runId}/stream,
  • webhooks record delivery attempts for subscribed events.
For production, keep component API docs disabled unless you intentionally expose them on a protected network.
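
The outside-in order above can be sketched as a list of probes that stops at the first failure, so the report points at the outermost broken layer. The runner below is a minimal illustration; `console_loads` is a hypothetical probe, and the other checks would need probes written against your own deployment.

```python
import urllib.request
from typing import Callable, List, Optional, Tuple

# A check is a display name plus a zero-argument probe returning True on success.
Check = Tuple[str, Callable[[], bool]]


def console_loads(url: str) -> bool:
    """Hypothetical probe: the console answers 200 over HTTPS."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.status == 200


def first_failure(checks: List[Check]) -> Optional[str]:
    """Run checks in order and return the name of the first that fails."""
    for name, probe in checks:
        try:
            ok = probe()
        except Exception:
            ok = False  # a probe that raises counts as a failed check
        if not ok:
            return name
    return None  # every check passed
```

For example, `first_failure([("console over TLS", lambda: console_loads("https://console.acornops.dev/"))])` returns `None` when the console responds, or the check name when it does not.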

Logs to inspect

| Symptom | Start with |
| --- | --- |
| Sign-in fails | Control-plane logs and OIDC provider redirect URI/client settings |
| Console cannot call API | Management console runtime config, CORS, ingress, and cookie settings |
| Agent disconnected | k8s agent logs, control-plane WebSocket logs, agent key, outbound network policy |
| Runs stuck pending | Control-plane dispatch logs, execution-engine logs, Redis connectivity |
| Model calls fail | LLM-gateway logs, provider keys, run JWT claims, provider allow-list |
| MCP tools fail | Gateway egress policy, MCP server discovery, auth secret lookup |
| Webhook verification fails | Consumer raw-body handling, timestamp, signature header, subscription secret |
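
The webhook row lists the usual failure points on the consumer side. The sketch below shows where each one enters a typical verification routine. The signing scheme is an assumption made for illustration (hex HMAC-SHA256 over `timestamp + "." + raw body`, with a five-minute tolerance); this page does not specify the platform's actual scheme, so check the webhook reference before relying on it.

```python
import hashlib
import hmac
import time
from typing import Optional


def expected_signature(secret: bytes, timestamp: str, raw_body: bytes) -> str:
    """Assumed scheme: hex HMAC-SHA256 over b"<timestamp>.<raw body>"."""
    signed = timestamp.encode("utf-8") + b"." + raw_body
    return hmac.new(secret, signed, hashlib.sha256).hexdigest()


def verify_webhook(
    secret: bytes,
    timestamp: str,
    signature: str,
    raw_body: bytes,
    now: Optional[float] = None,
    tolerance_seconds: int = 300,
) -> bool:
    """Check timestamp freshness, then compare signatures safely."""
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False
    current = time.time() if now is None else now
    if abs(current - sent_at) > tolerance_seconds:
        return False  # stale or future-dated delivery
    expected = expected_signature(secret, timestamp, raw_body)
    # Verify against the exact bytes received; re-serialized JSON will not match.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Whatever the real scheme is, the raw-body point generally holds: a consumer that parses and re-serializes JSON before verifying will usually compute a different digest.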

Rate limits and budgets

Use both platform and gateway limits:
  • run max runtime,
  • max steps,
  • max tool calls,
  • duplicate tool-call cap,
  • max output tokens,
  • per-window LLM request limits,
  • per-window tool-call limits.
Keep read-write runs limited to roles and clusters where remediation is expected.
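
To make the interaction between a total tool-call limit and a duplicate tool-call cap concrete, here is a minimal sketch of how such a budget could be enforced per run. The class and field names are hypothetical and do not mirror the platform's configuration keys.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RunBudget:
    max_tool_calls: int           # total tool calls allowed in one run
    duplicate_tool_call_cap: int  # identical (tool, args) calls allowed


@dataclass
class RunUsage:
    tool_calls: int = 0
    seen_calls: Counter = field(default_factory=Counter)


def check_tool_call(
    budget: RunBudget, usage: RunUsage, tool: str, args_key: str
) -> Optional[str]:
    """Return a refusal reason, or record the call and return None."""
    if usage.tool_calls >= budget.max_tool_calls:
        return "max tool calls reached"
    if usage.seen_calls[(tool, args_key)] >= budget.duplicate_tool_call_cap:
        return "duplicate tool-call cap reached"
    usage.tool_calls += 1
    usage.seen_calls[(tool, args_key)] += 1
    return None
```

In this sketch a refused call does not consume budget, so a run that keeps repeating the same call is stopped by the duplicate cap well before it exhausts the total.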

Secret rotation

Rotate secrets with service-specific rollout plans:
| Secret type | Rotation impact |
| --- | --- |
| OIDC client secret | Browser sign-in can fail until control-plane pods reload the new value |
| Internal service tokens | Rotate paired services together or support overlap during rollout |
| Agent keys | Use the control-plane agent-key rotation flow per cluster |
| Provider API keys | Update the gateway secret backend, then verify a read-only run |
| Webhook signing secrets | Create or rotate the subscription secret and update the consumer |
Do not reuse internal service tokens across environments.
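
"Support overlap during rollout" for internal service tokens usually means the receiving service accepts both the new and the previous token for a short window. The sketch below shows one way to express that; the function name and token values are illustrative and not part of the platform.

```python
import hmac
from typing import Iterable


def token_accepted(presented: str, accepted_tokens: Iterable[str]) -> bool:
    """Accept the presented token if it matches any currently valid token."""
    presented_bytes = presented.encode("utf-8")
    matched = False
    for candidate in accepted_tokens:
        # compare_digest avoids leaking how much of the token matched via timing.
        if hmac.compare_digest(presented_bytes, candidate.encode("utf-8")):
            matched = True
    return matched
```

During rotation the accepted list would hold the new and the old token; once every caller has rolled, dropping the old entry completes the rotation.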

Public route drift

Keep these routes consistent across deployment, OIDC provider settings, docs, and management console runtime config:
  • https://console.acornops.dev/
  • https://acornops.dev/api/v1
  • https://docs.acornops.dev/
  • wss://acornops.dev/api/v1/agent/connect
Default OIDC settings use https://console.acornops.dev/api/v1/auth/oidc/callback so browser session cookies are set on the console origin. If you override the redirect URI, update the provider allow-list and deployment config together. Legacy /docs routes on acornops.dev should redirect to docs.acornops.dev.
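
A drift check along these lines can be sketched as a comparison between one expected route set and the values found in each configuration source, plus an origin check on the redirect URI. The source names and route keys below are hypothetical labels; only the URLs come from this page.

```python
from typing import Dict, List, Optional, Tuple
from urllib.parse import urlsplit


def origin(url: str) -> str:
    """Return the scheme and host[:port] portion of a URL."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}"


def redirect_on_console_origin(console_url: str, redirect_uri: str) -> bool:
    """True when the OIDC callback shares the console's origin."""
    return origin(console_url) == origin(redirect_uri)


def drifted_routes(
    expected: Dict[str, str], sources: Dict[str, Dict[str, str]]
) -> List[Tuple[str, str, Optional[str]]]:
    """Return (source, route, actual) for each value that differs from expected."""
    drift = []
    for source_name in sorted(sources):
        configured = sources[source_name]
        for route_name in sorted(expected):
            actual = configured.get(route_name)  # None when the route is missing
            if actual != expected[route_name]:
                drift.append((source_name, route_name, actual))
    return drift
```

An empty result from `drifted_routes` means every source agrees with the expected set; a missing route is reported with `None` as its actual value.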