AcornOps separates the operator experience, platform state, run execution, model/tool access, and workload-cluster access into distinct components. That separation keeps the public surface small (only the management console and the control plane are publicly exposed) and lets workload clusters connect outbound only.

System overview

Component responsibilities

| Component | Responsibility | Public exposure |
| --- | --- | --- |
| Management console | Workspace, cluster, member, tool, and session user experience | https://console.acornops.dev/ |
| Control plane | Auth, workspaces, clusters, agent WebSockets, run state, webhooks, API authorization | https://acornops.dev/api/v1 |
| Execution engine | Run lifecycle, orchestration loop, event emission, cancellation, tool-call coordination | Internal only |
| LLM gateway | Provider routing, run-scoped model access, MCP registry, secret lookup, gateway auditing | Internal only |
| k8s agent | Workload-cluster snapshots, logs, and builtin Kubernetes tool execution | Outbound only |

Runtime flow

  1. An operator signs in through OIDC and uses the management console with a cookie-backed control-plane session.
  2. The operator creates a workspace, registers a cluster, and installs the generated k8s agent command into the workload cluster.
  3. The k8s agent authenticates with its agent key and opens an outbound WebSocket to the control plane.
  4. The agent sends heartbeats, capability metadata, and snapshots. The control plane persists cluster state and synchronizes builtin tools into the LLM gateway.
  5. When an operator sends a troubleshooting message, the control plane creates a run and dispatches it to the execution engine.
  6. The execution engine fetches run context from the control plane, streams model requests through the LLM gateway, and calls allowed tools.
  7. The control plane records run events and streams them back to the management console.
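Steps 5 through 7 can be pictured with a small in-memory sketch. It is illustrative only: the class names, the `execute` function, and the event names such as `run.created` are assumptions and are not taken from the AcornOps API. The real components communicate over the network with the credentials described in the next section rather than through direct function calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class RunEvent:
    run_id: str
    kind: str  # hypothetical event names, chosen for illustration
    detail: str = ""


@dataclass
class ControlPlane:
    """Stand-in for the control plane: owns run state and fans events out."""
    events: Dict[str, List[RunEvent]] = field(default_factory=dict)
    subscribers: List[Callable[[RunEvent], None]] = field(default_factory=list)

    def create_run(self, message: str) -> str:
        # Step 5: a troubleshooting message becomes a run.
        run_id = f"run-{len(self.events) + 1}"
        self.events[run_id] = []
        self.record(RunEvent(run_id, "run.created", message))
        return run_id

    def record(self, event: RunEvent) -> None:
        # Step 7: record the event, then stream it to connected consoles.
        self.events[event.run_id].append(event)
        for notify in self.subscribers:
            notify(event)


def execute(control_plane: ControlPlane, run_id: str) -> None:
    """Stand-in for the execution engine (step 6)."""
    control_plane.record(RunEvent(run_id, "model.requested"))
    control_plane.record(RunEvent(run_id, "tool.called", "list_pods"))
    control_plane.record(RunEvent(run_id, "run.completed"))


control_plane = ControlPlane()
console_feed: List[RunEvent] = []
control_plane.subscribers.append(console_feed.append)

run_id = control_plane.create_run("Why is the checkout pod restarting?")
execute(control_plane, run_id)
print([event.kind for event in console_feed])
# ['run.created', 'model.requested', 'tool.called', 'run.completed']
```

The point of the sketch is the ownership split: the engine produces events, while the control plane remains the single place where run state is recorded and streamed from.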

Auth and trust boundaries

AcornOps uses separate credentials for separate trust boundaries:

| Channel | Credential |
| --- | --- |
| Browser to control plane | Session cookie |
| Control plane to execution engine | EXECUTION_ENGINE_DISPATCH_TOKEN |
| Execution engine to control plane | ORCH_SERVICE_TOKEN |
| Control plane to LLM gateway admin API | LLM_GATEWAY_ADMIN_TOKEN |
| Execution engine to LLM gateway runtime API | Control-plane-signed run JWT |
| k8s agent to control plane | Cluster agent key |

Run-scoped JWTs include workspace, cluster, session, run, allowed provider, allowed model, allowed tool, and output-budget claims. The LLM gateway rejects requests whose body scope does not match the token scope.
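
The scope check described above might look roughly like the following sketch. The claim and field names (`workspace_id`, `allowed_tools`, `output_budget`, `max_output_tokens`, and so on) are hypothetical, since this page lists the claim categories but not their wire names. The sketch also assumes the token signature has already been verified and only compares the decoded claims with the request body.

```python
from typing import Any, Dict, List

# Hypothetical claim names; only the categories come from the documentation.
IDENTITY_CLAIMS = ("workspace_id", "cluster_id", "session_id", "run_id")


def scope_violations(claims: Dict[str, Any], body: Dict[str, Any]) -> List[str]:
    """Return the reasons a request body falls outside the token scope."""
    problems: List[str] = []
    for name in IDENTITY_CLAIMS:
        if body.get(name) != claims.get(name):
            problems.append(f"{name} mismatch")
    if body.get("provider") not in claims.get("allowed_providers", []):
        problems.append("provider not allowed")
    if body.get("model") not in claims.get("allowed_models", []):
        problems.append("model not allowed")
    if not set(body.get("tools", [])) <= set(claims.get("allowed_tools", [])):
        problems.append("tool not allowed")
    if body.get("max_output_tokens", 0) > claims.get("output_budget", 0):
        problems.append("output budget exceeded")
    return problems


claims = {
    "workspace_id": "ws-1", "cluster_id": "cl-1",
    "session_id": "s-1", "run_id": "run-1",
    "allowed_providers": ["provider-a"], "allowed_models": ["model-x"],
    "allowed_tools": ["list_pods", "get_logs"], "output_budget": 4096,
}
body = {
    "workspace_id": "ws-1", "cluster_id": "cl-1",
    "session_id": "s-1", "run_id": "run-1",
    "provider": "provider-a", "model": "model-x",
    "tools": ["list_pods"], "max_output_tokens": 1024,
}

print(scope_violations(claims, body))                             # []
print(scope_violations(claims, {**body, "cluster_id": "cl-2"}))   # ['cluster_id mismatch']
```

A gateway following this pattern would reject any request for which the returned list is non-empty.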

Workspace and role model

A workspace owns members, clusters, MCP server settings, tool settings, sessions, runs, and webhooks. Server responses include role-derived permissions so clients can render actions without reimplementing authorization rules.

| Role | Typical capabilities |
| --- | --- |
| owner | Manage workspace, owners, admins, clusters, tools, MCP servers, keys, read-only runs, and read-write runs |
| admin | Manage non-owner members, clusters, tools, MCP servers, keys, read-only runs, and read-write runs |
| operator | Create sessions, create read-only runs, read logs when allowed, cancel runs |
| viewer | Read workspace, cluster, session, and run data |

The control plane prevents membership changes that would leave a workspace without an owner.
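
The last-owner rule can be expressed as a guard that runs before any membership change is committed. The following is a minimal sketch under stated assumptions: the function name `apply_membership_change` and the in-memory `members` mapping are hypothetical and do not describe the control plane's actual implementation.

```python
from typing import Dict, Optional


def apply_membership_change(
    members: Dict[str, str], user: str, new_role: Optional[str]
) -> Dict[str, str]:
    """Return the updated user-to-role mapping.

    Passing new_role=None removes the user. A ValueError is raised if the
    change would leave the workspace without an owner.
    """
    updated = dict(members)
    if new_role is None:
        updated.pop(user, None)
    else:
        updated[user] = new_role
    if "owner" not in updated.values():
        raise ValueError("a workspace must keep at least one owner")
    return updated


members = {"alice": "owner", "bob": "admin"}

# Promoting bob first makes it safe to demote alice afterwards.
members = apply_membership_change(members, "bob", "owner")
members = apply_membership_change(members, "alice", "viewer")

try:
    apply_membership_change(members, "bob", None)  # would remove the last owner
except ValueError as error:
    print(error)
```

Validating the resulting membership, rather than the individual operation, covers removals and demotions with the same check.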

Data ownership

| Data | Owner |
| --- | --- |
| Workspaces, members, clusters, sessions, runs, invitations, webhooks | Control plane |
| Run reservations and worker coordination | Execution engine with Redis |
| Provider credentials, MCP registry, gateway request records | LLM gateway |
| Live Kubernetes discovery and builtin tool behavior | k8s agent |

The agent snapshot is persisted by the control plane and exposed to the management console. Snapshot branches include resources, events, and metrics when available.

High availability posture

The management console, execution engine, and LLM gateway can run with multiple replicas when backed by external Postgres and Redis. The default platform chart sets the control plane to one replica because WebSocket routing and background scheduling are process-local. The workload-cluster agent supports active-passive high availability through Kubernetes Lease leader election: when replicaCount is greater than one, enable leader election so that exactly one agent runtime connects at a time.
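
The agent replica rule lends itself to a pre-deployment check. The sketch below is an assumption-laden illustration: `replicaCount` is the setting named above, but the `leaderElection.enabled` key and the `check_agent_ha` helper are hypothetical, so consult the agent chart for the real option names.

```python
from typing import Any, Dict, List


def check_agent_ha(values: Dict[str, Any]) -> List[str]:
    """Flag agent settings that would let several replicas connect at once."""
    replicas = values.get("replicaCount", 1)
    # "leaderElection.enabled" is a hypothetical key used for illustration.
    leader_election = values.get("leaderElection", {}).get("enabled", False)
    warnings: List[str] = []
    if replicas > 1 and not leader_election:
        warnings.append(
            "replicaCount > 1 without leader election: "
            "more than one agent runtime could connect"
        )
    return warnings


print(check_agent_ha({"replicaCount": 1}))                                     # []
print(check_agent_ha({"replicaCount": 2, "leaderElection": {"enabled": True}}))  # []
print(len(check_agent_ha({"replicaCount": 2})))                                # 1
```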