Governance

Human-in-the-Loop AI: 5 Production Patterns

Your agent can ignore an instruction to ask before acting. These five patterns make sure it cannot.

16 min read · Updated Apr 2026
TL;DR
  • "Ask the user before acting" is not human-in-the-loop. It is a suggestion the agent can ignore. Real HITL requires architectural enforcement outside the model.
  • Five patterns cover the full spectrum: pre-execution gates, exception escalation, graduated autonomy, sampled audit, and post-execution output review. Most teams implement only one.
  • 5 Patterns: from full approval to sampled audit
  • Working Code: policy YAML + workflow configs
  • Decision Flowchart: pick the right pattern for your case

Why this matters now

57% of companies have AI agents in production (PwC, 2026). 80% have encountered risky autonomous behavior (McKinsey). EU AI Act Article 14 human oversight requirements take full effect August 2, 2026. HITL is no longer a nice-to-have — it is a regulatory and operational requirement.

The real problem with human-in-the-loop AI

Most teams implement HITL the same way: add a system prompt instruction that says “ask the user before performing destructive actions.”

This is not human-in-the-loop. It is a suggestion.

The instruction lives inside the model context. The agent can ignore it, reinterpret it, or hallucinate past it entirely. A single prompt injection can strip it out. Even without adversarial input, models routinely skip confirmation steps when they “decide” the action is safe enough to proceed.

Real HITL has three properties that prompt-based approaches lack:

Architectural enforcement

The gate lives outside the model. The dispatcher refuses to execute the job until the gate clears. No reasoning or prompt engineering can bypass it.

Deterministic policy

Which actions require approval is defined in version-controlled configuration, not in prose instructions that the model interprets differently each run.

Immutable audit trail

Every decision — ALLOW, DENY, REQUIRE_APPROVAL — is logged with the policy version, timestamp, and approver identity.

The core distinction

“We told it to ask permission” is not the same as “it cannot act without permission.” The first is an instruction. The second is an architectural constraint. Every pattern in this guide implements the second.
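These three properties can be sketched as a dispatcher-side gate. The snippet below is a minimal illustration, not Cordum's actual API; `Decision`, `AuditLog`, and `dispatch` are hypothetical names:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Decision:
    verdict: str            # "ALLOW" | "DENY" | "REQUIRE_APPROVAL"
    policy_version: str     # version of the policy that produced this verdict
    reason: str

@dataclass
class AuditLog:
    entries: list = field(default_factory=list)

    def record(self, job_id, decision, approver=None):
        # Append-only: every decision keeps policy version, timestamp, approver.
        self.entries.append({
            "job_id": job_id,
            "verdict": decision.verdict,
            "policy_version": decision.policy_version,
            "approver": approver,
            "ts": datetime.now(timezone.utc).isoformat(),
        })

def dispatch(job_id, decision, audit, approved_by=None):
    """Execute only when policy allows, or when a human has cleared the gate."""
    audit.record(job_id, decision, approver=approved_by)
    if decision.verdict == "ALLOW":
        return "executed"
    if decision.verdict == "REQUIRE_APPROVAL":
        return "executed" if approved_by else "held"
    return "aborted"  # DENY: the agent never reaches the execution path
```

Note that the agent has no code path to execution that skips `dispatch`: the gate is the only entry point, which is what makes the constraint architectural rather than prompt-based.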

What top articles cover vs miss

| Source | Strong coverage | Missing piece |
| --- | --- | --- |
| Arun Baby: Human-in-the-Loop Patterns | Strong pattern taxonomy with practical escalation and confidence-bound ideas for autonomous agents. | No scheduler-level control plane design for deterministic pre-dispatch enforcement and evidence binding. |
| HumanOps: Human-in-the-Loop Guide | Good framing of oversight boundaries and where humans should intervene in agent workflows. | Limited runnable guidance for approval queue SLOs, reviewer load balancing, and fatigue prevention controls. |
| Folio3: HITL Best Practices | Clear business-level use cases and intervention checkpoints for hybrid AI workflows. | No low-level contract for policy-hash approvals, timeout behavior, and escalation ownership at scale. |

Gap summary: most posts explain why HITL matters. Few show how to wire deterministic approval gates, prevent reviewer fatigue, and preserve execution-proof audit evidence. That is the focus of the patterns below.

reviewer-ops-contract.json
JSON
{
  "approval_queue_policy": {
    "target_median_wait_sec": 900,
    "target_p95_wait_sec": 1800,
    "max_open_requests_per_reviewer": 40,
    "escalate_after_sec": 1200
  },
  "routing": {
    "financial_actions": "finance-approvers",
    "security_actions": "security-approvers",
    "default": "ops-approvers"
  },
  "fatigue_controls": {
    "auto_rebalance": true,
    "cooldown_after_approvals": 80,
    "sampled_second_review_rate": 0.1
  },
  "evidence_contract": {
    "required_fields": ["policy_snapshot", "job_hash", "approver_id", "decision_ts"],
    "reject_if_incomplete": true
  }
}

Decision flowchart: which pattern to use

Not every action needs the same level of oversight. Use this decision tree to pick the right pattern.

Q1

Is the action irreversible or high-blast-radius?

Deleting production data, sending customer emails, deploying to prod, high-value transactions

Yes → Pattern 1: Pre-execution approval gate

Q2

Is this routine with occasional edge cases?

Customer support, data processing, content generation

Yes → Pattern 2: Exception-based escalation

Q3

New agent or domain where you need to build trust?

New deployment, new team, post-incident recovery

Yes → Pattern 3: Graduated autonomy

Q4

High-volume where reviewing everything is impractical?

Thousands of tickets, bulk moderation

Yes → Pattern 4: Sampled audit at scale

Q5

Could the output contain sensitive data?

Database queries, customer data, cross-system integrations

Yes → Pattern 5: Post-execution output review

Most production systems combine 2-3 patterns. Pre-execution gates + output review is the most common baseline.

Pattern 1: Pre-execution approval gate

Highest safety / Lowest throughput

The agent proposes an action. A policy engine evaluates it. If the policy returns REQUIRE_APPROVAL, the dispatcher holds the job in a pending queue. Only after explicit human approval does it execute.

The agent never touches the execution path. It cannot skip the gate or convince the policy engine. The enforcement is architectural.

How it works

1. Agent submits a job request (deploy, delete, purchase).
2. Scheduler sends the request to the Safety Kernel for policy evaluation.
3. Safety Kernel matches policy rules, returns REQUIRE_APPROVAL.
4. Job enters pending. Approval request routed to the right team.
5. Human approves (with optional constraints) or denies.
6. Dispatcher executes or aborts. Decision logged with policy snapshot.

Policy configuration

safety-policy.yaml
YAML
version: v1
rules:
  - id: approve-prod-deploys
    match:
      topics: ["job.deploy.*", "job.migrate.*"]
      risk_tags: [prod, write]
    decision: require_approval
    reason: "Production deploys require human sign-off"
    constraints:
      max_runtime_sec: 900
  - id: constrain-external-calls
    match:
      risk_tags: [egress]
      topics: ["job.*.external-api"]
    decision: allow_with_constraints
    constraints:
      network_allowlist: ["api.slack.com", "api.github.com"]
      max_runtime_sec: 60
    reason: "External API calls limited to approved endpoints"
  - id: allow-reads
    match:
      topics: ["job.*.read", "job.*.list"]
    decision: allow
    reason: "Read-only operations pass without gates"
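For illustration, here is how a policy engine might evaluate rules like the ones above: match topic globs and risk tags, first match wins, fail closed. The `evaluate` function and the default-deny behavior are assumptions for this sketch, not a spec:

```python
from fnmatch import fnmatch

# Rules transcribed from safety-policy.yaml; first match wins.
RULES = [
    {"id": "approve-prod-deploys",
     "topics": ["job.deploy.*", "job.migrate.*"],
     "risk_tags": {"prod", "write"},
     "decision": "require_approval"},
    {"id": "allow-reads",
     "topics": ["job.*.read", "job.*.list"],
     "risk_tags": set(),
     "decision": "allow"},
]

def evaluate(topic, risk_tags, rules=RULES):
    for rule in rules:
        topic_ok = any(fnmatch(topic, pat) for pat in rule["topics"])
        tags_ok = rule["risk_tags"] <= set(risk_tags)  # all rule tags must be present
        if topic_ok and tags_ok:
            return rule["decision"], rule["id"]
    return "deny", "default-deny"  # fail closed: unmatched jobs do not run
```

The default-deny fallthrough matters: a job the policy has never seen is held, not executed.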

When to use: production deploys, financial transactions, data deletion, schema migrations — any action where undo is expensive or impossible.

Tradeoff: maximum safety, but adds latency. Use risk tiers to limit this to actions that genuinely warrant the wait.

Pattern 2: Exception-based escalation

Balanced safety / Good throughput

The agent operates autonomously within defined boundaries. When it encounters uncertainty, anomalous patterns, or edge cases outside its confidence bounds, it escalates to a human instead of guessing.

Different from Pattern 1: the default is autonomy. Escalation is the exception.

Escalation triggers

escalation-policy.yaml
YAML
version: v1
rules:
  - id: low-confidence-escalation
    match:
      confidence_below: 0.7
      topics: ["job.support.*"]
    decision: require_approval
    reason: "Agent confidence below threshold"
  - id: anomaly-escalation
    match:
      anomaly_score_above: 0.8
    decision: require_approval
    reason: "Anomalous behavior detected"
  - id: retry-escalation
    match:
      retry_count_above: 2
    decision: require_approval
    reason: "Multiple retries suggest human judgment needed"
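A sketch of how these triggers might be checked in code, with thresholds taken from the YAML above (the `should_escalate` helper and job-dict shape are hypothetical):

```python
# Thresholds mirror escalation-policy.yaml.
def should_escalate(job):
    """Return the list of fired triggers; an empty list means proceed autonomously."""
    triggers = []
    if job.get("confidence", 1.0) < 0.7 and job.get("topic", "").startswith("job.support."):
        triggers.append("low-confidence-escalation")
    if job.get("anomaly_score", 0.0) > 0.8:
        triggers.append("anomaly-escalation")
    if job.get("retry_count", 0) > 2:
        triggers.append("retry-escalation")
    return triggers
```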

Real-world example: customer support

Auto-resolved

Customer requests a refund on a small order placed 3 days ago, no prior refunds. Agent confidence: 0.95. Refund processed in 2 seconds.

Escalated

A large order placed 45 days ago, five prior refunds this quarter. Agent confidence: 0.3. Packages context, escalates to support lead.

When to use: customer support, content moderation, data pipelines — domains where 80% is routine and 20% needs judgment.

Tradeoff: requires well-calibrated confidence thresholds. Too low = everything escalates. Too high = edge cases slip through.

Pattern 3: Graduated autonomy

Trust builds over time

The agent starts with maximum oversight. As it demonstrates competence — measured by success rate, error rate, and human audit results — it earns more autonomy. One policy violation demotes it back.

This mirrors how organizations onboard employees. You don’t give a new hire prod database access on day one.

Autonomy levels

graduated-autonomy.yaml
YAML
version: v1
autonomy_levels:
  level_0:
    name: "supervised"
    rules:
      - match: { topics: ["*"] }
        decision: require_approval
  level_1:
    name: "assisted"
    rules:
      - match: { topics: ["job.*.read", "job.*.list"] }
        decision: allow
      - match: { topics: ["*"] }
        decision: require_approval
  level_2:
    name: "semi-autonomous"
    rules:
      - match: { risk_tags: [prod, delete, secrets] }
        decision: require_approval
      - match: { topics: ["*"] }
        decision: allow_with_constraints
  level_3:
    name: "autonomous"
    rules:
      - match: { risk_tags: [delete, secrets] }
        decision: require_approval
      - match: { topics: ["*"] }
        decision: allow
promotion_criteria:
  min_successful_actions: 50
  max_error_rate: 0.02
  review_period_days: 7
  requires_human_sign_off: true
demotion_triggers:
  - policy_violation
  - safety_incident
  - error_rate_above: 0.05
| Level | Name | Autonomous scope | Promotion |
| --- | --- | --- | --- |
| 0 | Supervised | Nothing | 50 actions, <2% errors |
| 1 | Assisted | Read-only | 50 more, <2% errors |
| 2 | Semi-auto | Routine writes | 50 more, <2% errors |
| 3 | Autonomous | Most actions | Demote on incident |
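The promotion and demotion criteria can be expressed as a small state function. This is an illustrative sketch; the stats fields and the `next_level` helper are assumed names:

```python
def eligible_for_promotion(stats, criteria):
    """Promotion needs volume, quality, a full review period, and human sign-off."""
    return (
        stats["successful_actions"] >= criteria["min_successful_actions"]
        and stats["error_rate"] <= criteria["max_error_rate"]
        and stats["days_at_level"] >= criteria["review_period_days"]
        and stats["human_sign_off"]
    )

def next_level(current, stats, criteria, incident=False):
    # Demotion is immediate on any incident or elevated error rate;
    # promotion is gradual and capped at level 3.
    if incident or stats["error_rate"] > 0.05:
        return max(0, current - 1)
    if eligible_for_promotion(stats, criteria):
        return min(3, current + 1)
    return current
```

The asymmetry is deliberate: trust is earned slowly over a review window and lost instantly on an incident.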

When to use: new agent deployments, new domains, post-incident recovery, regulated environments.

Tradeoff: slower to reach full autonomy, but when Level 3 is reached, you have quantitative evidence it was earned.

Pattern 4: Sampled audit at scale

High throughput / Statistical oversight

When an agent handles thousands of actions per day, reviewing every one is impossible. Sampled audit gives statistical confidence without the bottleneck: a random subset is flagged for human review after execution.

This is how financial auditing works. You review a statistically significant sample, weighted toward higher-risk categories.

audit-policy.yaml
YAML
version: v1
audit:
  sampling:
    rate: 0.10
    strategy: stratified
    weights:
      risk_tags:
        prod: 3.0
        write: 2.0
        read: 0.5
  mandatory:
    - match: { amount_usd_above: 5000 }
    - match: { affected_users_above: 100 }
    - match: { first_time_action: true }
  routing:
    security_actions: security-team
    financial_actions: finance-approvers
    default: ops-review

  • Uniform: every Nth action audited. Simple but misses risk concentration.
  • Stratified: higher-risk sampled at higher rates. Prod writes 3x more likely than reads.
  • Mandatory: certain actions always reviewed. First-time, high-dollar, high impact.
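The stratified strategy can be sketched as: scale the base sampling rate by the highest risk weight on the action, cap at 1.0, and let mandatory rules override sampling entirely. The helper names and payload shape below are illustrative:

```python
import random

# Weights and base rate taken from audit-policy.yaml.
WEIGHTS = {"prod": 3.0, "write": 2.0, "read": 0.5}
BASE_RATE = 0.10

def audit_probability(risk_tags):
    # Scale by the riskiest tag on the action; untagged actions use weight 1.0.
    weight = max((WEIGHTS.get(t, 1.0) for t in risk_tags), default=1.0)
    return min(1.0, BASE_RATE * weight)

def select_for_audit(action, rng=random):
    # Mandatory rules always override sampling.
    if action.get("amount_usd", 0) > 5000 or action.get("first_time_action"):
        return True
    return rng.random() < audit_probability(action.get("risk_tags", []))
```

With these weights, a prod write is audited 30% of the time, a plain read 5%.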

When to use: 1,000+ actions/day, compliance with statistical sampling, Level 3 agents.

Tradeoff: some bad actions slip through. 10% stratified catches systematic issues within hours but misses one-offs. Combine with Pattern 5.

Pattern 5: Post-execution output review

Catches what pre-execution gates miss

Every article about HITL focuses on what happens before the agent acts. Almost none discuss after.

Pre-execution gates evaluate intent. Post-execution review evaluates results. An agent approved to query a customer database can return results that include SSNs, API keys, or cross-tenant data. The pre-execution gate saw “read customer record” and allowed it. The output is where the problem lives.

ALLOW

Clean output. Passes to caller or next workflow step.

REDACT

PII, credentials, internal data stripped before delivery.

QUARANTINE

Suspicious output held for human review.

output-safety-result.json
JSON
{
  "job_id": "job_9f2a",
  "output_safety_decision": "REDACT",
  "original_fields_redacted": 3,
  "redaction_details": [
    { "field": "response.customer_data.ssn", "reason": "PII detected", "action": "replaced with [REDACTED]" },
    { "field": "response.internal_notes", "reason": "Internal-only data in customer-facing output", "action": "field removed" },
    { "field": "response.debug_trace", "reason": "System internals exposed", "action": "field removed" }
  ],
  "policy_snapshot": "v1:b4e2c1"
}
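A minimal scanner that could produce a result like the one above might look like this. The regex patterns and field conventions are illustrative only; production PII detection needs far more than two regexes:

```python
import re

# Illustrative detectors only; real deployments need a proper PII/credential scanner.
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
API_KEY = re.compile(r"\b(sk|key)_[A-Za-z0-9]{8,}\b")
INTERNAL_FIELDS = {"internal_notes", "debug_trace"}

def review_output(payload):
    redactions, clean = [], {}
    for name, value in payload.items():
        if name in INTERNAL_FIELDS:
            # Internal-only data never reaches a customer-facing response.
            redactions.append({"field": name, "action": "field removed"})
            continue
        if SSN.search(str(value)) or API_KEY.search(str(value)):
            redactions.append({"field": name, "action": "replaced with [REDACTED]"})
            clean[name] = "[REDACTED]"
        else:
            clean[name] = value
    decision = "REDACT" if redactions else "ALLOW"
    return {"output_safety_decision": decision,
            "redaction_details": redactions,
            "payload": clean}
```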

When to use: agents handling PII, cross-system integrations, customer-facing outputs, GDPR/HIPAA/SOC 2.

Tradeoff: REDACT adds milliseconds, QUARANTINE adds hours. Design policies to quarantine rarely.

Real-world scenarios

Production deployment

Pattern 1. Irreversible. Match risk_tags: [prod, write], route to Slack. 4-hour SLA acceptable because cost of bad deploy exceeds cost of waiting.

Data deletion

Patterns 1 + 3. Permanently destructive. risk_tags: [delete] stays gated at every autonomy level. No exceptions.

Customer email

Patterns 2 + 5. Routine messages auto-send. Edge cases (VIP, legal threat) escalate. All outbound passes output safety for PII and hallucinated promises.

Financial transaction

Patterns 1 + 4. Auto-approve below $100, require approval above, mandatory audit above $5,000. 10% stratified sampling on all.

Putting it together: multi-step workflow

A customer refund combining Patterns 1, 2, and 5: automated analysis, conditional escalation, and output safety.

refund-workflow.yaml
YAML
name: customer-refund
trigger: support.refund.requested
steps:
  analyze:
    type: action
    topic: job.support.analyze-refund
    timeout_sec: 30
  decide:
    type: condition
    depends_on: [analyze]
    branches:
      auto_approve:
        when: "result.amount < 100 AND result.prior_refunds < 3"
        next: process
      needs_review:
        when: "result.amount >= 100 OR result.prior_refunds >= 3"
        next: human-review
  human-review:
    type: approval
    reason: "Refund exceeds auto-approval threshold"
    timeout_sec: 14400
    on_timeout: escalate
  process:
    type: action
    topic: job.support.process-refund
    depends_on: [decide]
    constraints:
      max_amount_usd: 10000

The decide step is Pattern 2 (auto-resolve routine, escalate edge cases). The human-review step is Pattern 1 (hard gate). Output safety (Pattern 5) runs on the result.

Anti-patterns that kill HITL systems

Approve-everything gates

Route everything to humans. Reviewers rubber-stamp within hours. Latency cost with zero safety benefit.

Prompt-only enforcement

System prompt instructions for safety. Agent is both proposer and enforcer. One prompt injection bypasses every gate.

No timeout handling

Requests sit indefinitely. Workflows stall. Define timeouts: auto-deny high-risk, escalate medium, auto-approve low with logging.
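The risk-tiered timeout policy described above can be sketched as a single resolver function (the names and tiers are hypothetical):

```python
def resolve_timeout(request):
    """Decide what happens when an approval request exceeds its SLA."""
    tier = request["risk_tier"]
    if tier == "high":
        return {"decision": "DENY", "reason": "timeout on high-risk action"}  # fail closed
    if tier == "medium":
        return {"decision": "ESCALATE",
                "route": request.get("escalation_owner", "ops-lead")}
    # Low risk: let the workflow proceed, but keep the evidence trail.
    return {"decision": "ALLOW", "reason": "auto-approved on timeout (logged)"}
```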

Pre-execution only

Gate the action, ignore the output. Approved queries can return PII. Pattern 5 catches what gates miss.

Frequently asked questions

What is human-in-the-loop AI?

Human-in-the-loop (HITL) is a system design pattern where human judgment is required at specific decision points in an otherwise automated AI workflow. Unlike human-on-the-loop (monitoring) or human-out-of-the-loop (full autonomy), HITL means the system cannot proceed past a checkpoint without explicit human action.

How is HITL different from just telling the agent to ask permission?

A prompt instruction is a suggestion the agent can ignore or hallucinate past. Real HITL is an architectural constraint: the dispatcher refuses to execute the job until a human clears the gate. The enforcement point is outside the model, so no prompt injection can bypass it.

How many actions should require human approval?

Fewer than 5-10 percent in a well-calibrated system. If your approval queue is longer, your risk tiers are miscalibrated. The goal is surgical oversight: humans review only decisions that genuinely need human judgment.

What causes approval fatigue?

Too many low-risk requests. Reviewers rubber-stamp within hours. Prevention: risk-tiered routing. Low-risk auto-passes. Medium-risk runs within constraints. Only genuinely high-risk actions reach humans.

Which HITL pattern should I start with?

Pattern 1 (pre-execution approval gate) for your highest-risk actions. Add Pattern 2 (exception escalation) for routine operations. Introduce Pattern 3 (graduated autonomy) as you build confidence. Patterns 4 and 5 layer on as you scale.

Is HITL compatible with EU AI Act requirements?

Yes. Article 14 requires human oversight with real-time intervention, understanding system outputs, and overriding decisions. Policy-bound approval gates with immutable audit trails satisfy these requirements.

How does HITL work in multi-agent systems?

The workflow engine pauses at the approval step. The approval request includes the full context chain: which agent initiated, what upstream steps completed, and what the proposed action is. The reviewer sees the complete picture.
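For illustration, an approval-request payload carrying the full context chain might be assembled like this (the field names are hypothetical):

```python
def build_approval_request(job, chain):
    """Bundle the proposed action with its upstream context for the reviewer."""
    return {
        "job_id": job["id"],
        "proposed_action": job["topic"],
        "initiating_agent": chain[0]["agent"],
        "upstream_steps": [
            {"agent": s["agent"], "step": s["step"], "status": s["status"]}
            for s in chain
        ],
        "policy_snapshot": job["policy_version"],
    }
```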

Can I implement HITL without a dedicated platform?

Basic approval gates work with queues and webhooks. But policy versioning, audit trails, graduated autonomy scoring, and output safety at scale require significant infrastructure. A governance platform like Cordum provides these out of the box.

Next steps

See these patterns in action. The Cordum quickstart includes a workflow with pre-execution approval gates, graduated autonomy levels, and output safety — all configured in YAML.