Architecture
Cordum is an Agent Control Plane built around a gateway, scheduler, Safety Kernel, workflow engine, and CAP v2 messages on NATS. Redis stores state, pointers, config, workflow data, and indexes.
Core components
- Gateway: HTTP, WebSocket, and gRPC entrypoint for jobs, workflows, approvals, config, policy bundles, DLQ, artifacts, locks, packs, workers, and traces.
- Scheduler: consumes submit, result, cancel, and heartbeat subjects; evaluates pre-dispatch policy; routes jobs; persists job state; and manages DLQ/reconciliation.
- Safety Kernel: gRPC policy decision point with Check, Evaluate, Explain, Simulate, snapshot tracking, constraints, remediations, and optional decision caching.
- Workflow engine: stores workflow definitions and runs in Redis, advances DAG steps, and keeps append-only run timelines.
- An optional gRPC service provides BuildWindow and UpdateMemory over Redis-backed memory keys.
- Workers: user-provided workers subscribe to job topics or direct worker subjects, hydrate pointers, execute work, and publish CAP JobResult packets.
- Licensing: three-tier entitlement system (Community, Team, Enterprise) in core/licensing/. Enforces worker limits, RPS caps, and feature gates at the gateway and scheduler.
- Telemetry: privacy-first anonymous usage collection. Defaults to local_only mode; opt-in anonymous mode shares aggregate metrics without PII.
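
Workers subscribe either to job topics or to direct worker subjects. The sketch below illustrates how NATS-style wildcard matching decides which subscription receives a published subject: `*` matches exactly one token and `>` matches one or more trailing tokens. The helper names `subject_matches` and `worker_subject`, and the topic name `job.default`, are hypothetical and are not part of Cordum or the NATS client API; real clients leave the matching to the server.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """Return True if a NATS-style subscription pattern matches a subject.

    '*' matches exactly one token; '>' matches one or more trailing tokens.
    Assumes both arguments are well-formed, dot-separated subjects.
    """
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' is only valid as the last token and needs at least
            # one remaining subject token to match.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False  # pattern is longer than the subject
        if p != "*" and p != s_tokens[i]:
            return False  # literal token mismatch
    # Without a trailing '>', token counts must be equal.
    return len(p_tokens) == len(s_tokens)


def worker_subject(worker_id: str) -> str:
    """Build a direct worker subject in the worker.<id>.jobs shape."""
    return f"worker.{worker_id}.jobs"


if __name__ == "__main__":
    # "job.default" is a made-up topic used only for illustration.
    print(subject_matches("job.*", "job.default"))
    print(subject_matches("job.*", "job.default.high"))
    print(subject_matches(worker_subject("w1"), "worker.w1.jobs"))
```

A subscription on a single-token wildcard such as `job.*` would therefore not see subjects with extra tokens, which is worth keeping in mind when naming job topics.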
Job lifecycle
1. The gateway validates auth and tenant, then writes input to ctx:<job_id>.
2. Submit-time policy runs in the gateway before state is persisted or the bus is used.
3. The gateway publishes BusPacket{JobRequest} to sys.job.submit.
4. The scheduler sets PENDING, evaluates pre-dispatch policy, resolves routing, and dispatches.
5. The worker hydrates context_ptr, executes, writes res:<job_id>, and publishes JobResult.
6. The scheduler finalizes state, records result_ptr, applies DLQ rules, and optionally stores output-safety metadata.
7. Workflow runs advance from job results and maintain an append-only timeline.
The gateway uses GATEWAY_POLICY_FAIL_MODE at submit time; the scheduler uses POLICY_CHECK_FAIL_MODE at dispatch time.
Bus subjects
- sys.job.submit
- sys.job.result
- sys.job.progress
- sys.job.dlq
- sys.job.cancel
- sys.heartbeat
- sys.workflow.event
- job.*
- worker.<id>.jobs
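
The job lifecycle steps can be sketched as a small in-memory model, with a dict standing in for Redis and a list standing in for the bus. This is an illustration under stated assumptions, not Cordum's implementation: the class `MiniControlPlane`, the fail-mode values `"open"` and `"closed"`, the topic `default`, every state name other than PENDING, and the choice to record `result_ptr` under `job:meta:<job_id>` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


class PolicyUnavailable(Exception):
    """Raised by a policy check that cannot reach its decision point."""


def decide(check: Callable[[dict], bool], job: dict, fail_mode: str) -> bool:
    """Run a policy check; fall back to fail_mode if the check is unavailable."""
    try:
        return check(job)
    except PolicyUnavailable:
        # Assumed semantics: "open" allows the job, anything else denies it.
        return fail_mode == "open"


@dataclass
class MiniControlPlane:
    store: Dict[str, object] = field(default_factory=dict)     # stands in for Redis
    bus: List[Tuple[str, dict]] = field(default_factory=list)  # stands in for the bus

    def submit(self, job_id: str, payload: dict, check, fail_mode: str = "closed") -> bool:
        self.store[f"ctx:{job_id}"] = payload                   # step 1: write input
        job = {"job_id": job_id, "context_ptr": f"ctx:{job_id}"}
        if not decide(check, job, fail_mode):                   # step 2: submit-time policy
            return False
        self.bus.append(("sys.job.submit", job))                # step 3: publish
        return True

    def schedule(self, job: dict, check, topic: str, fail_mode: str = "closed") -> bool:
        state_key = f"job:state:{job['job_id']}"
        self.store[state_key] = "PENDING"                       # step 4: set PENDING
        if not decide(check, job, fail_mode):                   # pre-dispatch policy
            self.store[state_key] = "DENIED"                    # hypothetical state name
            return False
        self.store[state_key] = "DISPATCHED"                    # hypothetical state name
        self.bus.append((f"job.{topic}", job))                  # route and dispatch
        return True

    def work(self, job: dict, handler: Callable[[dict], dict]) -> None:
        data = self.store[job["context_ptr"]]                   # step 5: hydrate context_ptr
        result_ptr = f"res:{job['job_id']}"
        self.store[result_ptr] = handler(data)                  # execute and write result
        self.bus.append(("sys.job.result",
                         {"job_id": job["job_id"], "result_ptr": result_ptr}))

    def finalize(self, result: dict) -> None:
        job_id = result["job_id"]                               # step 6: finalize state
        self.store[f"job:state:{job_id}"] = "SUCCEEDED"         # hypothetical state name
        self.store[f"job:meta:{job_id}"] = {"result_ptr": result["result_ptr"]}


if __name__ == "__main__":
    cp = MiniControlPlane()
    cp.submit("j1", {"x": 2}, check=lambda job: True)
    _, job = cp.bus[-1]
    cp.schedule(job, check=lambda job: True, topic="default")
    cp.work(job, handler=lambda data: {"y": data["x"] * 2})
    cp.finalize(cp.bus[-1][1])
    print(cp.store["job:state:j1"], cp.store["res:j1"])
```

The two policy checks are passed in separately so that each can carry its own fail mode, mirroring the split between the gateway setting at submit time and the scheduler setting at dispatch time.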
Redis keys
- ctx:<job_id> / res:<job_id> / art:<id>
- job:meta:<job_id> / job:state:<job_id> / job:index:<state>
- job:events:<job_id> / trace:<trace_id>
- wf:def:<workflow_id> / wf:run:<run_id>
- wf:run:timeline:<run_id> / wf:run:idempotency:<key>
- cfg:<scope>:<id> / cfg:system:policy / cfg:system:packs
- schema:<id> / schema:index / dlq:index
- mem:<memory_id>:events / chunks / summary
- license:current / license:usage
- telemetry:consent / telemetry:buffer
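
The key patterns above are plain string templates, so client code can build them with small helpers rather than scattering format strings. The following is a minimal sketch; the function names are hypothetical, and expanding the memory entry to `mem:<memory_id>:events`, `mem:<memory_id>:chunks`, and `mem:<memory_id>:summary` is an assumption about how that shorthand should be read.

```python
from typing import Dict


def job_keys(job_id: str) -> Dict[str, str]:
    """Per-job keys: input context, result, metadata, state, and events."""
    return {
        "context": f"ctx:{job_id}",
        "result": f"res:{job_id}",
        "meta": f"job:meta:{job_id}",
        "state": f"job:state:{job_id}",
        "events": f"job:events:{job_id}",
    }


def job_index_key(state: str) -> str:
    """Index key that groups jobs by state."""
    return f"job:index:{state}"


def workflow_run_keys(run_id: str) -> Dict[str, str]:
    """Keys for a workflow run and its append-only timeline."""
    return {
        "run": f"wf:run:{run_id}",
        "timeline": f"wf:run:timeline:{run_id}",
    }


def memory_keys(memory_id: str) -> Dict[str, str]:
    """Memory keys, assuming each suffix hangs off mem:<memory_id>."""
    return {name: f"mem:{memory_id}:{name}"
            for name in ("events", "chunks", "summary")}


if __name__ == "__main__":
    print(job_keys("j1")["context"])
    print(job_index_key("PENDING"))
    print(memory_keys("m1")["summary"])
```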
Protocol and boundaries
Bus traffic uses CAP v2 types from github.com/cordum-io/cap/v2/cordum/agent/v1. Licensing and telemetry ship in the core repo; advanced auth features (SAML/SSO, multi-tenant RBAC) are kept in cordum-enterprise. The CAP wire contract and SDKs live in the separate CAP documentation.