This page covers day-two operations for the central platform and connected workload clusters.
Upgrade flow
For production-style environments:
- Review release notes and changed configuration keys.
- Back up the control-plane and LLM-gateway Postgres databases.
- Confirm Redis availability and capacity.
- Apply migrations before starting updated services.
- Roll the management console, control plane, execution engine, and LLM gateway.
- Verify OIDC sign-in, workspace listing, agent connectivity, and a read-only run.
- Roll k8s agents separately when chart or agent behavior changes.
Pin image tags during upgrades so rollback targets are explicit.
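As a rough illustration of the pinning advice, the sketch below flags image references that would float during a rollout. The helper names and the example image names are hypothetical, and the rule it applies (an explicit tag other than `latest`, or a sha256 digest) is an assumption about what "pinned" should mean in your environment, not a platform requirement.

```python
import re

# A digest reference such as name@sha256:<64 hex chars> is always explicit.
_DIGEST = re.compile(r"@sha256:[0-9a-f]{64}$")


def is_pinned(image_ref: str) -> bool:
    """Return True if the reference carries a digest or a non-floating tag."""
    if _DIGEST.search(image_ref):
        return True
    # Only look for a tag after the final '/', so a registry port such as
    # registry.example.com:5000/app is not mistaken for a tag.
    last_segment = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_segment:
        return False  # no tag at all resolves to an implicit 'latest'
    tag = last_segment.rsplit(":", 1)[1]
    return tag not in ("", "latest")


def unpinned(images: dict[str, str]) -> list[str]:
    """Return the sorted component names whose image reference is not pinned."""
    return sorted(name for name, ref in images.items() if not is_pinned(ref))
```

Running `unpinned` over the image references in a release manifest before an upgrade gives a short list of components whose rollback target would otherwise be ambiguous.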
Backups
Back up durable state:
| Store | Contains |
|---|---|
| Control-plane Postgres | Workspaces, users, members, clusters, sessions, runs, invitations, webhooks |
| LLM-gateway Postgres | MCP registry, gateway metadata, encrypted provider or MCP secrets when using database-backed secrets |
| External secret manager | Provider keys and MCP auth secrets when Vault or another manager is enabled |
Redis holds coordination data, rate limits, and transient run/event state rather than durable records. Size it for expected run concurrency and event volume.
Health checks
Verify the platform from the outside in:
- management console loads over TLS,
- GET /api/v1/me works after sign-in,
- OIDC callback returns to the management console,
- a workspace can be created and listed,
- a k8s agent can connect over WebSocket,
- a run streams events through GET /api/v1/runs/{runId}/stream,
- webhooks record delivery attempts for subscribed events.
For production, keep component API docs disabled unless you intentionally expose them on a protected network.
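The outside-in order of the checks in this section can be captured as data, so the same list drives a manual runbook or a smoke test. The following is a minimal sketch: `Check`, `build_checks`, and `run_checks` are hypothetical names, only the HTTP-reachable checks are modeled, and the `probe` callable (which would carry an authenticated session) is left to the caller.

```python
from typing import Callable, NamedTuple


class Check(NamedTuple):
    name: str
    method: str
    url: str


def build_checks(console_url: str, api_url: str, run_id: str) -> list[Check]:
    """List HTTP checks in outside-in order: console first, then API routes."""
    api = api_url.rstrip("/")
    return [
        Check("console loads over TLS", "GET", console_url),
        Check("current user after sign-in", "GET", f"{api}/me"),
        Check("run event stream", "GET", f"{api}/runs/{run_id}/stream"),
    ]


def run_checks(checks: list[Check], probe: Callable[[Check], bool]) -> list[str]:
    """Run every check in order and return the names of the ones that failed."""
    return [check.name for check in checks if not probe(check)]
```

Injecting `probe` keeps the ordering logic testable without network access. A real probe would issue the request with a signed-in session and return whether the response was acceptable.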
Logs to inspect
| Symptom | Start with |
|---|---|
| Sign-in fails | Control-plane logs and OIDC provider redirect URI/client settings |
| Console cannot call API | Management console runtime config, CORS, ingress, and cookie settings |
| Agent disconnected | k8s agent logs, control-plane WebSocket logs, agent key, outbound network policy |
| Runs stuck pending | Control-plane dispatch logs, execution-engine logs, Redis connectivity |
| Model calls fail | LLM-gateway logs, provider keys, run JWT claims, provider allow-list |
| MCP tools fail | Gateway egress policy, MCP server discovery, auth secret lookup |
| Webhook verification fails | Consumer raw-body handling, timestamp, signature header, subscription secret |
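The webhook row above names four things a consumer can get wrong: raw-body handling, the timestamp, the signature header, and the subscription secret. The sketch below shows how those pieces typically fit together. The signing scheme is an assumption for illustration only (hex HMAC-SHA256 over `<timestamp>.<raw body>` with a 300-second tolerance), so check the platform's webhook reference for the actual scheme and header names before relying on it.

```python
import hashlib
import hmac
import time
from typing import Optional

TOLERANCE_SECONDS = 300  # ASSUMED replay window, not taken from the platform docs


def sign(secret: bytes, timestamp: str, raw_body: bytes) -> str:
    """Compute the ASSUMED signature: hex HMAC-SHA256 of '<timestamp>.<raw body>'."""
    message = timestamp.encode("utf-8") + b"." + raw_body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, timestamp: str, raw_body: bytes, signature: str,
           now: Optional[float] = None) -> bool:
    """Check freshness first, then compare signatures in constant time."""
    now = time.time() if now is None else now
    try:
        sent_at = int(timestamp)
    except ValueError:
        return False  # malformed timestamp header
    if abs(now - sent_at) > TOLERANCE_SECONDS:
        return False  # too old or too far in the future
    expected = sign(secret, timestamp, raw_body)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Whatever the real scheme is, the raw-body point generalizes: verification must run on the exact bytes received, because parsing and re-serializing JSON can change whitespace or key order and invalidate the signature.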
Rate limits and budgets
Use both platform and gateway limits:
- run max runtime,
- max steps,
- max tool calls,
- duplicate tool-call cap,
- max output tokens,
- per-window LLM request limits,
- per-window tool-call limits.
Keep read-write runs limited to roles and clusters where remediation is expected.
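To make the per-run limits concrete, here is a minimal sketch of a budget check. The class and field names are hypothetical and do not reflect the platform's configuration keys. The per-window LLM and tool-call limits are not modeled here, since they depend on a shared counter rather than on a single run.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RunBudget:
    max_runtime_seconds: int
    max_steps: int
    max_tool_calls: int
    max_duplicate_tool_calls: int
    max_output_tokens: int


@dataclass
class RunUsage:
    runtime_seconds: float = 0.0
    steps: int = 0
    tool_calls: int = 0
    duplicate_tool_calls: int = 0
    output_tokens: int = 0


def first_exceeded(budget: RunBudget, usage: RunUsage) -> Optional[str]:
    """Return the name of the first limit the run has gone past, or None."""
    pairs = [
        ("max runtime", usage.runtime_seconds, budget.max_runtime_seconds),
        ("max steps", usage.steps, budget.max_steps),
        ("max tool calls", usage.tool_calls, budget.max_tool_calls),
        ("duplicate tool-call cap", usage.duplicate_tool_calls,
         budget.max_duplicate_tool_calls),
        ("max output tokens", usage.output_tokens, budget.max_output_tokens),
    ]
    for name, used, limit in pairs:
        if used > limit:
            return name
    return None
```

Returning the name of the limit, rather than a bare boolean, makes it easier to tell from logs which budget stopped a run.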
Secret rotation
Rotate secrets with service-specific rollout plans:
| Secret type | Rotation impact |
|---|---|
| OIDC client secret | Browser sign-in can fail until control-plane pods reload the new value |
| Internal service tokens | Rotate paired services together or support overlap during rollout |
| Agent keys | Use the control-plane agent-key rotation flow per cluster |
| Provider API keys | Update the gateway secret backend, then verify a read-only run |
| Webhook signing secrets | Create or rotate the subscription secret and update the consumer |
Do not reuse internal service tokens across environments.
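One way to "support overlap during rollout" for internal service tokens is to have the receiving service accept both the new and the previous token until every caller has been rolled. The sketch below illustrates that idea only. `token_accepted` is a hypothetical helper, and how the accepted list is loaded (environment, secret manager, mounted file) is left open.

```python
import hmac
from typing import Iterable


def token_accepted(presented: str, accepted: Iterable[str]) -> bool:
    """Return True if the presented token matches any currently accepted token."""
    presented_bytes = presented.encode("utf-8")
    matched = False
    for candidate in accepted:
        # compare_digest avoids leaking how much of the token matched via timing.
        if hmac.compare_digest(presented_bytes, candidate.encode("utf-8")):
            matched = True
    return matched
```

During the overlap window the accepted list holds two values. Once all callers present the new token, remove the old one so the rotation actually takes effect.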
Public route drift
Keep these routes consistent across deployment, OIDC provider settings, docs, and management console runtime config:
- https://console.acornops.dev/
- https://acornops.dev/api/v1
- https://docs.acornops.dev/
- wss://acornops.dev/api/v1/agent/connect
Default OIDC settings use https://console.acornops.dev/api/v1/auth/oidc/callback so browser session cookies are set on the console origin. If you override the redirect URI, update the provider allow-list and deployment config together.
Legacy /docs routes on acornops.dev should redirect to docs.acornops.dev.
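Drift between these routes is easy to introduce and tedious to spot by eye, so a small consistency check can help. The sketch below is a hypothetical helper. Its rules are inferred from the default values listed in this section (agent WebSocket under the API host and path, OIDC redirect on the console origin) and may need adjusting for deployments that intentionally differ.

```python
from urllib.parse import urlsplit


def route_drift(console_url: str, api_url: str, agent_ws_url: str,
                oidc_redirect_uri: str) -> list[str]:
    """Return human-readable problems; an empty list means no drift was found."""
    console = urlsplit(console_url)
    api = urlsplit(api_url)
    ws = urlsplit(agent_ws_url)
    redirect = urlsplit(oidc_redirect_uri)

    problems = []
    if console.scheme != "https" or api.scheme != "https":
        problems.append("console and API must use https")
    if ws.scheme != "wss":
        problems.append("agent connect URL must use wss")
    if ws.netloc != api.netloc:
        problems.append("agent connect host differs from API host")
    if not ws.path.startswith(api.path.rstrip("/") + "/"):
        problems.append("agent connect path is not under the API base path")
    # Session cookies are set on the console origin, so the callback must live there.
    if (redirect.scheme, redirect.netloc) != (console.scheme, console.netloc):
        problems.append("OIDC redirect URI is not on the console origin")
    return problems
```

Feeding the function the values from deployment config and from the OIDC provider settings, side by side, is a cheap way to catch a redirect URI that was updated in only one place.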