The production problem
Many teams call their system safe because they run one classifier before generation. That misses the main risk window: tool execution.
By the time a write action hits a repo, ticketing system, or cloud API, advisory guardrails are too late. You need a policy decision point that can hard-stop execution, require approval, or apply constraints.
What top-ranking sources cover vs. miss
| Source | Strong coverage | Missing piece |
|---|---|---|
| OpenAI practical guide to building agents | Strong framing for layered guardrails, tool risk ratings, and mixing classifiers with deterministic checks. | No concrete contract for policy outcomes tied to scheduler state transitions and approval binding. |
| NVIDIA NeMo Guardrails architecture guide | Detailed event-driven runtime and multi-stage guardrail flow with canonical intent and next-step generation. | Does not focus on pre-dispatch policy gating for external worker jobs and approval queue mechanics. |
| AWS ApplyGuardrail API guide | Clear pre- and post-model checking pattern with an independent API and explicit INPUT versus OUTPUT sources. | Limited guidance on deterministic governance for tool execution, run timelines, and policy snapshot lineage. |
Decision contract
A safety kernel contract should be small and strict. If decisions are fuzzy, operators will invent manual exceptions and your queue policy turns into folklore.
| Decision | Effect | Scheduler behavior | Evidence |
|---|---|---|---|
| ALLOW | Job can proceed | Dispatch normally | Rule id, reason, snapshot are recorded |
| DENY | Job is blocked | Reject before worker execution | Decision record shows deny rule and reason |
| REQUIRE_APPROVAL | Human gate required | State becomes APPROVAL_REQUIRED and waits | Approval stores policy snapshot and decision summary |
| THROTTLE | Rate pressure signal | Submit path returns ResourceExhausted | Decision audit includes throttle reason |
| ALLOW_WITH_CONSTRAINTS | Allowed with strict bounds | Dispatch with runtime limits | Constraints persisted with policy decision |
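To make the contract concrete, here is a minimal Python sketch of the five decisions and the single scheduler behavior each one maps to. The class and field names are illustrative assumptions, not Cordum's actual types. Of the behavior labels, only `APPROVAL_REQUIRED` and the ResourceExhausted signal are named in the table above; the rest are placeholders.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "ALLOW"
    DENY = "DENY"
    REQUIRE_APPROVAL = "REQUIRE_APPROVAL"
    THROTTLE = "THROTTLE"
    ALLOW_WITH_CONSTRAINTS = "ALLOW_WITH_CONSTRAINTS"


@dataclass(frozen=True)
class DecisionRecord:
    """Evidence kept for every decision (hypothetical field names)."""
    decision: Decision
    rule_id: str
    reason: str
    policy_snapshot: str
    constraints: dict = field(default_factory=dict)


# One decision, one scheduler behavior. Labels other than APPROVAL_REQUIRED
# and RESOURCE_EXHAUSTED are placeholders invented for this sketch.
SCHEDULER_BEHAVIOR = {
    Decision.ALLOW: "DISPATCH",
    Decision.DENY: "REJECT",
    Decision.REQUIRE_APPROVAL: "APPROVAL_REQUIRED",
    Decision.THROTTLE: "RESOURCE_EXHAUSTED",
    Decision.ALLOW_WITH_CONSTRAINTS: "DISPATCH_WITH_LIMITS",
}


def scheduler_behavior(record: DecisionRecord) -> str:
    """Return the scheduler behavior for a decision record."""
    # A constrained allow without constraints is a contract violation,
    # so it fails loudly instead of silently dispatching unbounded work.
    if record.decision is Decision.ALLOW_WITH_CONSTRAINTS and not record.constraints:
        raise ValueError("ALLOW_WITH_CONSTRAINTS requires non-empty constraints")
    return SCHEDULER_BEHAVIOR[record.decision]
```

Keeping the mapping in one closed table is the point: a decision value outside the enum cannot be constructed, and there is no code path where an operator-defined exception can slip in.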
```yaml
version: v1
rules:
  - id: read-only-allow
    match:
      topics: ["job.mcp-bridge.read.*"]
    decision: allow
  - id: prod-write-needs-approval
    match:
      topics: ["job.mcp-bridge.write.*"]
      risk_tags: ["prod", "write"]
    decision: require_approval
    reason: "Production writes must be approved"
  - id: medium-risk-bounded
    match:
      topics: ["job.agent.exec.*"]
      risk_tags: ["medium"]
    decision: allow_with_constraints
    constraints:
      max_runtime_sec: 60
      max_retries: 1
      max_artifact_bytes: 1048576
  - id: destructive-deny
    match:
      risk_tags: ["destructive"]
    decision: deny
```

Runtime implementation
Cordum evaluates policy at submit time in the gateway and again at dispatch time in the scheduler. That double gate closes race windows between intake and worker execution.
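The double gate can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the first-match-wins rule order, glob matching on topics, the requirement that all of a rule's risk tags be present, the default deny, and the `submit`/`dispatch` function names. Cordum's real matching semantics and precedence may differ.

```python
from fnmatch import fnmatchcase

RULES_V1 = [
    {"id": "read-only-allow", "topics": ["job.mcp-bridge.read.*"], "decision": "allow"},
    {"id": "prod-write-needs-approval", "topics": ["job.mcp-bridge.write.*"],
     "risk_tags": ["prod", "write"], "decision": "require_approval"},
    {"id": "destructive-deny", "risk_tags": ["destructive"], "decision": "deny"},
]


def evaluate(rules, topic, risk_tags):
    """Return (decision, rule_id) for the first matching rule, else default deny."""
    for rule in rules:
        topics = rule.get("topics")
        # A rule with no topics matches any topic (assumed semantics).
        if topics and not any(fnmatchcase(topic, pattern) for pattern in topics):
            continue
        # Every tag listed on the rule must be present on the job (assumed semantics).
        if not set(rule.get("risk_tags", [])) <= set(risk_tags):
            continue
        return rule["decision"], rule["id"]
    return "deny", "default-deny"


def submit(policy, job):
    """Gate 1: evaluate at intake and record which snapshot made the call."""
    decision, rule_id = evaluate(policy["rules"], job["topic"], job["risk_tags"])
    return {"job": job, "decision": decision, "rule_id": rule_id,
            "snapshot": policy["snapshot"]}


def dispatch(policy, queued):
    """Gate 2: re-evaluate against the policy that is current at dispatch time."""
    job = queued["job"]
    decision, rule_id = evaluate(policy["rules"], job["topic"], job["risk_tags"])
    return {"dispatch": decision == "allow", "decision": decision,
            "rule_id": rule_id, "snapshot": policy["snapshot"]}
```

A job that was allowed under snapshot `v1` at submit time is blocked at dispatch if a newer policy denies its topic in the meantime. That is the race window the second gate closes.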
Requests that require approval pause in the APPROVAL_REQUIRED state, and each approval is bound to the policy snapshot and the job hash before the job is requeued. That binding is the detail auditors ask for when incidents happen at 2 AM.
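One way to picture the binding is a hash over a canonical form of the job, stored next to the policy snapshot that produced the decision. The sketch below is an assumption about how such a binding could work, not Cordum's implementation. The function names and the choice of SHA-256 over sorted-key JSON are hypothetical, as is the rule that a changed snapshot invalidates the approval.

```python
import hashlib
import json


def job_hash(job: dict) -> str:
    """Hash a canonical JSON encoding so key order cannot change the digest."""
    canonical = json.dumps(job, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def bind_approval(job: dict, policy_snapshot: str, approver: str) -> dict:
    """Record what exactly was approved, and under which policy."""
    return {
        "approver": approver,
        "policy_snapshot": policy_snapshot,
        "job_hash": job_hash(job),
    }


def approval_still_valid(approval: dict, job: dict, current_snapshot: str) -> bool:
    """Check before requeueing that neither the job nor the policy has changed."""
    return (
        approval["policy_snapshot"] == current_snapshot
        and approval["job_hash"] == job_hash(job)
    )
```

With this shape, an approval granted for one payload cannot be reused for an edited payload, and the audit record shows which policy version the approver was looking at.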
```
POST /api/v1/policy/simulate

{
  "job_id": "job-sim-001",
  "tenant_id": "default",
  "topic": "job.mcp-bridge.write.update_issue",
  "labels": {
    "mcp.server": "jira",
    "mcp.action": "write"
  },
  "meta": {
    "capability": "ticket.update",
    "risk_tags": ["prod", "write"]
  }
}
```

```
200 OK

{
  "decision": "REQUIRE_APPROVAL",
  "policy_rule_id": "prod-write-needs-approval",
  "policy_reason": "Production writes must be approved",
  "policy_snapshot": "cfg:system:policy#sha256:7f3d...9c2b",
  "approval_required": true,
  "constraints": {}
}
```

```bash
# Decision history for a specific job
curl -sS http://localhost:8081/api/v1/jobs/job-sim-001/decisions

# Pending approvals with decision summary context
curl -sS "http://localhost:8081/api/v1/approvals?include_resolved=false"

# Policy change audit trail
curl -sS http://localhost:8081/api/v1/policy/audit
```
| Guardrail | Default | Why it exists |
|---|---|---|
| Gateway submit-time policy | enabled | Rejects risky work before job state is persisted or published to the bus |
| Scheduler dispatch-time policy | enabled | Blocks stale or bypassed requests from reaching workers |
| Safety client timeout | 2s | Keeps scheduler hot path responsive under policy service pressure |
| Policy reload interval | 30s | Applies rule updates without process restart |
| Decision cache max size | 10000 | Reduces repeated check latency for common requests |
| Fail mode | closed | Prevents unchecked execution if safety dependency is unavailable |
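The timeout and fail-mode defaults in the table interact: when the safety check does not answer in time, the fail mode decides what happens to the job. The Python sketch below shows that interaction under stated assumptions. `guarded_check`, `slow_check`, and the result shape are hypothetical names, and the thread pool stands in for whatever client the scheduler really uses.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

_POOL = ThreadPoolExecutor(max_workers=4)


def guarded_check(check, job, timeout_s=2.0, fail_closed=True):
    """Run a safety check with a deadline and apply the fail mode on error."""
    future = _POOL.submit(check, job)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        reason = "safety check timed out"
    except Exception as exc:  # the check itself raised
        reason = f"safety check failed: {exc}"
    future.cancel()  # best effort; a check that is already running is not interrupted
    # Fail closed: no answer means no execution.
    decision = "DENY" if fail_closed else "ALLOW"
    return {"decision": decision, "reason": reason}


def slow_check(job):
    """Stand-in for a policy service that responds too slowly."""
    time.sleep(0.3)
    return {"decision": "ALLOW", "reason": "ok"}
```

Calling `guarded_check(slow_check, {}, timeout_s=0.05)` returns a `DENY` with a timeout reason, which is the tradeoff the fail-closed default accepts: refusals during a policy service incident rather than unchecked execution.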
Output safety
Input policy does not guarantee safe outputs. Generated content can still contain secrets, unsafe payloads, or policy violations. That is why output safety is a separate gate.
| Decision | Meaning | State impact |
|---|---|---|
| ALLOW | Release output | Job remains succeeded |
| REDACT | Release sanitized output | Succeeded with preferred redacted pointer |
| QUARANTINE | Hold output for review | Moves to OUTPUT_QUARANTINED and emits DLQ event |
Output checks currently fail open in the scheduler hot path: if the checker is unavailable, the output is released unchecked. That protects availability at the cost of higher risk, so teams handling sensitive data should monitor skipped checks closely.
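A minimal Python sketch of an output gate with the three decisions and a fail-open wrapper follows. The detection rules are deliberately simple placeholders (a private key marker and the well-known AWS access key ID shape), and the `metrics` counter and function names are assumptions made for illustration, not Cordum's checker.

```python
import re

# AWS access key IDs follow the shape "AKIA" plus 16 uppercase alphanumerics.
ACCESS_KEY_ID = re.compile(r"AKIA[0-9A-Z]{16}")

# Hypothetical counter; a real deployment would export this as a metric.
metrics = {"skipped_output_checks": 0}


def check_output(text):
    """Return (decision, text_to_release) for a generated output."""
    if "BEGIN PRIVATE KEY" in text:
        # Too sensitive to auto-sanitize: hold for human review.
        return ("QUARANTINE", None)
    redacted, hits = ACCESS_KEY_ID.subn("[REDACTED]", text)
    if hits:
        return ("REDACT", redacted)
    return ("ALLOW", text)


def gate_output(text, checker=check_output):
    """Fail-open wrapper: if the checker errors, release the output but count the skip."""
    try:
        return checker(text)
    except Exception:
        metrics["skipped_output_checks"] += 1
        return ("ALLOW", text)
```

The counter is the important part of the fail-open path. Without it, a checker outage looks identical to a run of clean outputs.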
Limitations and tradeoffs
Rules, constraints, and exceptions need explicit ownership or they decay quickly.
Strict defaults can slow delivery unless simulation and rollback are part of normal release flow.
Fail-closed improves control but can increase refusal rates during safety dependency incidents.