Policy as Code for AI Agents

How to turn governance from static guidance into runtime enforcement with simulation-first rollouts.

Policy as Code · 14 min read · Updated Apr 2026
TL;DR
  • If policy does not execute before dispatch, it is not a runtime control.
  • Good policy-as-code means deterministic decisions plus explainable reasons.
  • Simulation should be mandatory before policy publish, not optional after incidents.
  • Rollback is a product feature, not a panic script.
Governance Drift

Policies in docs drift from real runtime behavior. Drift is where compliance failures start.

Simulation First

Simulate policy impact before publish. Surprise behavior in production is expensive.

Rollback Ready

Policy rollout without rollback is just hope with YAML syntax.

Scope

This guide focuses on policy-as-code for autonomous AI agents operating in production pipelines, not on high-level policy drafting. The emphasis is on execution behavior and evidence quality.

The production problem

Most organizations have AI policies. Fewer have AI policies that can block a risky action before a worker executes it.

That distinction matters. Under delivery pressure, manual review steps get skipped, scripts get reused, and “temporary exceptions” become default behavior.

What top-ranking sources cover and what they miss

| Source | Strong coverage | Missing piece |
| --- | --- | --- |
| Kyndryl policy-as-code article | Clear business risk framing and strong argument for machine-readable guardrails in regulated environments. | No concrete rule schema or deterministic dispatch integration details. |
| iMerit policy-as-code workflow article | Practical workflow insertion points and good examples of rule-to-workflow mapping. | Limited treatment of pre-dispatch runtime decision contracts for autonomous agents. |
| Upsun scalable AI governance article | Strong deployment-pipeline perspective with enforceable templates and platform consistency. | No policy outcome taxonomy tied to approval queues and execution state transitions. |

Policy model that scales

A workable model has four layers: match, decision, constraints, and evidence. Each layer must be explicit.

| Layer | Purpose | Common anti-pattern |
| --- | --- | --- |
| Rule match | Select policy branch using topic, risk tags, actor context, labels | Broad wildcard rules with hidden overrides |
| Decision | Return ALLOW, DENY, REQUIRE_APPROVAL, THROTTLE, or ALLOW_WITH_CONSTRAINTS | Soft recommendations that do not affect dispatch |
| Constraints | Bound runtime and blast radius even for allowed jobs | Allow decisions with no bounding controls |
| Evidence | Persist matched rule, reason, policy snapshot, and actor | No traceability from decision to run outcome |
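
The evidence layer can be illustrated with a small record type. This is a minimal sketch in Python, not Cordum's actual schema: `EvidenceRecord`, `build_evidence`, `to_log_line`, and the `job_id` and `decided_at` fields are hypothetical names. Only the matched rule, reason, policy snapshot, and actor come from the table above.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class EvidenceRecord:
    """One immutable record per policy decision (hypothetical shape)."""
    job_id: str           # assumption: jobs carry a stable identifier
    actor: str
    policy_snapshot: str
    matched_rule: str
    decision: str
    reason: str
    decided_at: str       # assumption: ISO 8601 timestamp in UTC


def build_evidence(job_id, actor, policy_snapshot, matched_rule,
                   decision, reason, now=None):
    """Build a record; `now` is injectable so tests stay deterministic."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    return EvidenceRecord(job_id, actor, policy_snapshot, matched_rule,
                          decision, reason, timestamp)


def to_log_line(record):
    """Serialize with sorted keys so identical records produce identical lines."""
    return json.dumps(asdict(record), sort_keys=True)
```

Writing one such line per decision, keyed by job, is one way to trace a run outcome back to the rule and snapshot that allowed it.
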
policy.yaml
YAML
version: v1
rules:
  - id: allow-read
    match:
      topics: ["job.mcp-bridge.read.*"]
      risk_tags: []
    decision: allow

  - id: require-approval-prod-write
    match:
      topics: ["job.mcp-bridge.write.*"]
      risk_tags: ["prod", "write"]
    decision: require_approval
    reason: "Prod writes require human approval"

  - id: constrain-medium-risk
    match:
      topics: ["job.agent.exec.*"]
      risk_tags: ["medium"]
    decision: allow_with_constraints
    constraints:
      max_runtime_sec: 60
      max_retries: 1
      network_allowlist: ["api.github.com", "api.slack.com"]

  - id: deny-destructive
    match:
      risk_tags: ["destructive"]
    decision: deny
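
To make the match and decision layers concrete, here is a minimal Python sketch of how rules shaped like the ones above could be evaluated. Everything about the semantics is an assumption, not documented engine behavior: rules are checked in file order and the first match wins, topic patterns are treated as shell-style globs, a rule's `risk_tags` must all be present on the job (an empty list requires nothing), a rule without `topics` matches any topic, and a job that matches no rule is denied. The names `Decision`, `rule_matches`, and `evaluate` are hypothetical.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import Optional

# The rules from policy.yaml, written as dicts to avoid a YAML dependency.
RULES = [
    {"id": "allow-read",
     "match": {"topics": ["job.mcp-bridge.read.*"], "risk_tags": []},
     "decision": "allow"},
    {"id": "require-approval-prod-write",
     "match": {"topics": ["job.mcp-bridge.write.*"],
               "risk_tags": ["prod", "write"]},
     "decision": "require_approval",
     "reason": "Prod writes require human approval"},
    {"id": "constrain-medium-risk",
     "match": {"topics": ["job.agent.exec.*"], "risk_tags": ["medium"]},
     "decision": "allow_with_constraints",
     "constraints": {"max_runtime_sec": 60, "max_retries": 1,
                     "network_allowlist": ["api.github.com", "api.slack.com"]}},
    {"id": "deny-destructive",
     "match": {"risk_tags": ["destructive"]},
     "decision": "deny"},
]


@dataclass(frozen=True)
class Decision:
    decision: str
    matched_rule: Optional[str]
    reason: str = ""
    constraints: dict = field(default_factory=dict)


def rule_matches(match, topic, risk_tags):
    """True if the topic fits any pattern and all required tags are present."""
    patterns = match.get("topics")
    if patterns and not any(fnmatchcase(topic, p) for p in patterns):
        return False
    return set(match.get("risk_tags", [])) <= risk_tags


def evaluate(rules, topic, risk_tags):
    """First matching rule wins; no match falls through to DENY (assumption)."""
    tags = set(risk_tags)
    for rule in rules:
        if rule_matches(rule.get("match", {}), topic, tags):
            return Decision(rule["decision"].upper(), rule["id"],
                            rule.get("reason", ""),
                            dict(rule.get("constraints", {})))
    return Decision("DENY", None, "no rule matched")
```

Under these assumed semantics, ordering matters: a read job tagged `destructive` would hit `allow-read` before `deny-destructive`. Whether a real engine resolves that by order, by decision precedence, or by rejecting the overlap is something to confirm before relying on broad deny rules.
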

Simulation and rollout

Simulation should run on representative, production-like jobs before policy publish. If a rule change flips more decisions than it was meant to affect, stop and adjust before rollout.

simulate-policy.http
JSON
POST /api/v1/policy/simulate
{
  "tenant_id": "default",
  "job": {
    "topic": "job.mcp-bridge.write.update_ticket",
    "risk_tags": ["prod", "write"],
    "labels": {
      "mcp.server": "jira",
      "mcp.action": "write"
    }
  }
}

200 OK
{
  "decision": "REQUIRE_APPROVAL",
  "reason": "Prod writes require human approval",
  "constraints": {
    "max_runtime_sec": 60,
    "max_retries": 1
  },
  "matched_rule": "require-approval-prod-write"
}
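
One way to turn single simulate calls into a publish gate is to replay a sample of jobs against the current and candidate policies and count how many decisions flip. The sketch below is in Python and makes several assumptions: `old_decide` and `new_decide` are hypothetical callables that return a decision string for a job (for example, thin wrappers around the simulate endpoint), and the 5% default threshold is a placeholder, not a recommended value.

```python
from collections import Counter


def diff_decisions(jobs, old_decide, new_decide):
    """Return (flip_counts, flipped_jobs) for jobs whose decision changes.

    flip_counts maps (before, after) pairs to how often that flip occurred.
    """
    flips = Counter()
    flipped = []
    for job in jobs:
        before, after = old_decide(job), new_decide(job)
        if before != after:
            flips[(before, after)] += 1
            flipped.append(job)
    return flips, flipped


def safe_to_publish(jobs, old_decide, new_decide, max_flip_ratio=0.05):
    """Gate publish on the share of sampled jobs whose decision changed."""
    if not jobs:
        return False  # nothing simulated means no evidence to publish on
    flips, _ = diff_decisions(jobs, old_decide, new_decide)
    return sum(flips.values()) / len(jobs) <= max_flip_ratio
```

Keeping the `(before, after)` pairs rather than a single count shows which direction the change moves traffic, for example ALLOW to REQUIRE_APPROVAL, which is what drives approval queue load.
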
policy-rollout.sh
Bash
# 1) Publish policy snapshot
curl -sS -X POST http://localhost:8081/api/v1/policy/publish \
  -H "Content-Type: application/json" \
  -d '{"snapshot":"v42","note":"tighten prod write controls"}'

# 2) Verify queue impact and decision mix
curl -sS "http://localhost:8081/api/v1/approvals?include_resolved=false"

# 3) Roll back if regression detected
curl -sS -X POST http://localhost:8081/api/v1/policy/rollback \
  -H "Content-Type: application/json" \
  -d '{"target_snapshot":"v41","note":"rollback due false positive spike"}'
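
The same three steps can be wrapped in a guarded rollout so the rollback decision is not left to whoever is watching the queue. This Python sketch is illustrative only: `publish`, `pending_approvals`, and `rollback` are hypothetical callables you would back with the endpoints above, the response shape of the approvals endpoint is not assumed here, and the growth threshold and observation window are placeholder values.

```python
import time


def guarded_rollout(publish, pending_approvals, rollback,
                    new_snapshot, previous_snapshot,
                    max_queue_growth=20, observe_sec=60.0, sleep=time.sleep):
    """Publish a snapshot, watch the approval queue, roll back on a spike.

    publish(snapshot) and rollback(snapshot) apply a policy snapshot.
    pending_approvals() returns the number of unresolved approvals.
    """
    baseline = pending_approvals()
    publish(new_snapshot)
    sleep(observe_sec)  # let real traffic hit the new policy
    growth = pending_approvals() - baseline
    if growth > max_queue_growth:
        rollback(previous_snapshot)
        return "rolled_back"
    return "published"
```

Passing the three operations in as callables keeps the control flow testable in a drill without touching a live control plane.
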

Operational defaults and guardrails

Stable policy operations depend on sane defaults. These values come from current Cordum references and should be validated against your environment:

| Guardrail | Default | Why it exists |
| --- | --- | --- |
| Safety unavailable fail mode | closed | Prevent unchecked execution when policy dependency is down |
| Safety request timeout | 2s | Bound hot-path latency and avoid scheduler lockups |
| Policy reload interval | 30s | Balance update speed against config churn |
| Decision cache max size | 10000 | Keep repeated checks fast while controlling memory usage |
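
Two of these defaults, the fail-closed mode with its 2s timeout and the bounded decision cache, can be sketched in a few lines of Python. This is an illustration of the behavior the table describes, not Cordum's implementation: `check_fail_closed` and `DecisionCache` are hypothetical names, `check` stands in for whatever call reaches the policy service, and least-recently-used eviction is an assumption, since the table only states a maximum size.

```python
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

_POOL = ThreadPoolExecutor(max_workers=4)


def check_fail_closed(check, job, timeout_sec=2.0):
    """Run check(job); return DENY if it times out or raises (fail closed)."""
    future = _POOL.submit(check, job)
    try:
        return future.result(timeout=timeout_sec)
    except FutureTimeout:
        future.cancel()  # best effort; a call that is already running continues
        return "DENY"
    except Exception:
        return "DENY"


class DecisionCache:
    """Size-bounded cache that evicts the least recently used entry."""

    def __init__(self, max_size=10000):
        self.max_size = max_size
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, decision):
        self._items[key] = decision
        self._items.move_to_end(key)
        while len(self._items) > self.max_size:
            self._items.popitem(last=False)  # drop the oldest entry
```

A cache like this would also need to be invalidated when a new policy snapshot is loaded, otherwise cached decisions can outlive the rules that produced them.
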

Limitations and tradeoffs

Rule maintenance cost

Policy catalogs require ownership and regular pruning as workflows and regulations evolve.

False-positive pressure

Overly strict rules can slow delivery if simulation and tuning loops are weak.

Platform dependency

Policy enforcement quality depends on reliable control-plane and telemetry infrastructure.

Frequently Asked Questions

What makes policy-as-code different from a normal governance document?
Policy-as-code is executable at runtime. It can alter dispatch outcomes directly, while documents only describe intent.
Should every policy change require a rollback plan?
Yes. Policy changes can break production behavior as quickly as code changes, so rollback should be standard operating procedure.
Do constraints matter if a job is already approved?
Yes. Approval answers whether work is allowed. Constraints answer how far that work is allowed to go.
What is a practical first policy to encode?
Start with high-risk write actions to production systems. This usually produces immediate risk reduction with manageable rollout scope.
Next step

Pick one high-risk production write path and enforce policy-as-code on it this week. Run simulation, publish, monitor approval queue behavior, and validate rollback in a controlled drill.
