Why AI workflows need orchestration
AI workflows include external tools, human approvals, and multi-step reasoning. Without orchestration, failures become silent and retries become ad-hoc scripts.
Deterministic orchestration
The core must be deterministic even when the AI is not. Orchestration is about explicit state transitions, governed decisions, and predictable retries. AI belongs in bounded workers, not inside the scheduler.
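As a rough illustration of "explicit state transitions", the sketch below models a single step as a small state machine that rejects any transition not listed in a table and records each change in a timeline. The state names, the `ALLOWED` table, and the `Step` class are assumptions made for this example; they are not taken from Cordum.

```python
from dataclasses import dataclass, field
from enum import Enum


class StepState(Enum):
    PENDING = "pending"
    WAITING_APPROVAL = "waiting_approval"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"


# Hypothetical transition table: anything not listed here is rejected.
ALLOWED = {
    StepState.PENDING: {StepState.RUNNING, StepState.WAITING_APPROVAL},
    StepState.WAITING_APPROVAL: {StepState.RUNNING, StepState.FAILED},
    # RUNNING -> PENDING models a retry being scheduled.
    StepState.RUNNING: {StepState.SUCCEEDED, StepState.FAILED, StepState.PENDING},
    StepState.SUCCEEDED: set(),
    StepState.FAILED: set(),
}


@dataclass
class Step:
    name: str
    state: StepState = StepState.PENDING
    timeline: list = field(default_factory=list)

    def transition(self, new_state: StepState) -> None:
        """Move to new_state if the table allows it, and record the change."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.value} -> {new_state.value}"
            )
        self.timeline.append((self.state.value, new_state.value))
        self.state = new_state
```

Whatever a worker (AI or otherwise) returns, the scheduler in this sketch only ever moves a step along edges in the table, which is what keeps the core predictable.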
DAG design for AI work
Make every step explicit and track its state in a run timeline.
name: incident-triage
input_schema: IncidentContext
steps:
  triage:
    type: worker
    topic: job.incident.enrich
  summarize:
    type: worker
    topic: job.incident.summarize
    depends_on: [triage]
  approval:
    type: approval
    depends_on: [summarize]
    reason: "Prod write detected"
  remediate:
    type: worker
    topic: job.incident.remediate
    depends_on: [approval]
    constraints:
      max_lines_changed: 500
      max_runtime_sec: 900
  closeout:
    type: notify
    depends_on: [remediate]
Governance hooks
Governance is not a plugin. The orchestrator should call a policy decision point before dispatch and pause runs that require approval.
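A minimal sketch of a policy check that runs before dispatch might look like the following. The `Decision` values, the `Job` fields, and the rules inside `decide` are hypothetical and exist only to show the shape of the call; a real policy decision point would evaluate externally defined rules.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    REQUIRE_APPROVAL = "require_approval"


@dataclass(frozen=True)
class Job:
    topic: str
    writes_to_prod: bool = False
    approved: bool = False


def decide(job: Job) -> Decision:
    """Hypothetical policy: unknown topics are denied, prod writes need approval."""
    if not job.topic.startswith("job.incident."):
        return Decision.DENY
    if job.writes_to_prod and not job.approved:
        return Decision.REQUIRE_APPROVAL
    return Decision.ALLOW


def dispatch(job: Job, publish) -> str:
    """Consult the policy first; publish only when the decision is ALLOW."""
    decision = decide(job)
    if decision is Decision.ALLOW:
        publish(job.topic, job)
        return "dispatched"
    if decision is Decision.REQUIRE_APPROVAL:
        return "paused"  # the run waits here until someone approves
    return "rejected"
```

The point of the design is that `publish` is unreachable without a decision, so a worker cannot be invoked around the policy check.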
Failure handling and retries
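Use retries for transient failures and dead-letter queue (DLQ) handling for poison messages, so that a message that can never succeed is set aside instead of being retried forever. The goal is predictable behavior under pressure.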
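One way to separate the two cases is sketched below: transient errors are retried with exponential backoff up to a fixed limit, while any other exception sends the message straight to a dead-letter list. `TransientError`, `run_with_retries`, and the default limits are assumptions made for this example, not part of any particular orchestrator.

```python
import time


class TransientError(Exception):
    """A failure that may succeed on retry, such as a timeout or rate limit."""


def run_with_retries(handler, message, dead_letters,
                     max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Run handler(message); retry transient failures, dead-letter the rest."""
    for attempt in range(1, max_attempts + 1):
        try:
            return handler(message)
        except TransientError:
            if attempt == max_attempts:
                break
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
        except Exception as exc:
            # Non-transient failure: treat as a poison message, do not retry.
            dead_letters.append((message, repr(exc)))
            return None
    dead_letters.append((message, "retries exhausted"))
    return None
```

The `sleep` parameter is injectable so the backoff schedule can be checked in tests without waiting; in production code the default `time.sleep` would apply.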
Why not Temporal or Airflow?
Temporal and Airflow are strong orchestrators, but they do not ship governance primitives by default. Cordum adds a safety gate, approvals, and an audit trail as first-class steps.
- Temporal: great at retries and state, lacks built-in policy decision points.
- Airflow: great for data pipelines, not designed for high-risk AI actions.
- Cordum: governance-first orchestration with approvals and constraints.
How Cordum orchestrates
Cordum uses a workflow engine to coordinate runs, a Safety Kernel to evaluate every job, and an append-only audit trail to record outcomes. NATS and Redis provide durable state and routing.
See the Workflow Engine overview for more details.