AI Agent Transactional Outbox Pattern

"Database committed, event lost" is still one of the most expensive bugs in production.

Guide · 11 min read · Mar 2026
TL;DR
  • Dual writes are still dual writes when an LLM is the caller.
  • Outbox records and business records must commit in one transaction.
  • Outbox solves atomic publish intent; consumers still need idempotency.
Atomic intent

Persist state change and event metadata together

Ordered relay

Forward committed outbox rows in deterministic order

Idempotent consume

Handle duplicate delivery safely downstream

Scope

This guide covers control-plane and workflow systems where one operation updates state and must publish an event for autonomous downstream processing.

The production problem

Autonomous agent platforms constantly write state and emit events. If those two writes are not atomic, eventual consistency turns into eventual incident.

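Classic failure pattern: the DB commit succeeds, the event publish fails, and downstream systems never observe the change. A naive retry can make things worse: if the first publish actually went through and only its acknowledgement was lost, the event is delivered twice.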

What top results miss

Source | Strong coverage | Missing piece
AWS transactional outbox pattern | Excellent dual-write failure framing, ordering, duplicate handling, and CDC alternatives. | No agent control-plane specifics for policy decisions and approval-gated actions.
Debezium Outbox integration | Practical CDC outbox implementation details with configurable event routing. | Limited guidance on runtime governance for autonomous agent operations.
Azure transactional outbox with Cosmos DB | Clear service/worker split and processed-marker flow for guaranteed event publishing. | Does not map outbox strategy to multi-agent execution and policy controls.

Outbox model for agents

Step | Required design | Failure if missing
Transaction | Write business row + outbox row in the same DB transaction. | State commits while event is lost, or event emits for rolled-back state.
Relay worker | Read committed outbox rows, publish events, mark processed. | Rows accumulate indefinitely or publish order becomes unstable.
Deduplication | Use message IDs and idempotent consumers downstream. | At-least-once delivery duplicates side effects.
Observability | Track outbox lag, publish failures, and stuck rows. | Silent drift between committed state and downstream systems.
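
The observability row above names three signals: lag, publish failures, and stuck rows. A minimal sketch of how they might be computed with Python's built-in `sqlite3` follows. The table shape, the epoch-seconds `created_at`, the `publish_attempts` counter, and the `stuck_after_attempts` threshold are all assumptions made for illustration, not a prescribed schema.

```python
import sqlite3

# Assumed outbox shape; created_at is stored as Unix epoch seconds for easy math.
SCHEMA = """
CREATE TABLE outbox(
    id TEXT PRIMARY KEY,
    created_at REAL NOT NULL,
    publish_attempts INTEGER NOT NULL DEFAULT 0,
    processed_at REAL
)
"""


def outbox_health(conn: sqlite3.Connection, now: float,
                  stuck_after_attempts: int = 5) -> dict:
    """Summarize unprocessed outbox rows for dashboards or alerts."""
    pending, oldest, stuck = conn.execute(
        "SELECT COUNT(*), MIN(created_at), "
        "COALESCE(SUM(publish_attempts >= ?), 0) "
        "FROM outbox WHERE processed_at IS NULL",
        (stuck_after_attempts,),
    ).fetchone()
    # Lag is the age of the oldest row the relay has not yet published.
    lag_seconds = 0.0 if oldest is None else now - oldest
    return {"pending": pending, "lag_seconds": lag_seconds, "stuck": stuck}
```

Calling `outbox_health(conn, now=time.time())` on a schedule and alerting when `lag_seconds` or `stuck` crosses a threshold is one way to surface drift before downstream systems notice it.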

Cordum runtime mapping

Mapping | Current behavior | Why it matters
Job + event persistence | Redis-backed job metadata and event logs with bus publish flow | Supports durable state + event sequencing in control-plane operations.
At-least-once bus delivery | JetStream durable subjects with idempotent handlers and locks | Outbox or relay consumers must expect duplicate deliveries.
DLQ integration | Terminal failures route to DLQ entries with indexed storage | Provides recovery path for relay/publish failures.
Policy before dispatch | Submit/dispatch path evaluates allow/deny/approve/throttle | Prevents unsafe outbox-triggered actions from bypassing governance.

Implementation examples

Atomic write transaction (SQL)

tx.sql
SQL
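BEGIN;
  -- Business row: $1 = run id, $2 = run payload.
  INSERT INTO workflow_run(id, state, payload) VALUES ($1, 'pending', $2);
  -- Outbox row in the same transaction: $3 = outbox id, $4 = event payload.
  INSERT INTO outbox(id, aggregate_id, event_type, payload, created_at)
  VALUES ($3, $1, 'run.created', $4, NOW());
COMMIT;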
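
To see the atomicity property in action, the same transaction can be sketched with Python's built-in `sqlite3` module. This is an illustration under stated assumptions, not the guide's production setup: the in-memory database, the helper names `open_db`, `create_run`, and `count`, and the `fail_before_commit` switch are all hypothetical and exist only to simulate a crash between the two writes.

```python
import json
import sqlite3
import uuid


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the two tables used by the SQL above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE workflow_run(id TEXT PRIMARY KEY, state TEXT, payload TEXT);
        CREATE TABLE outbox(
            id TEXT PRIMARY KEY,
            aggregate_id TEXT NOT NULL,
            event_type TEXT NOT NULL,
            payload TEXT NOT NULL,
            created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        );
        """
    )
    return conn


def create_run(conn: sqlite3.Connection, run_id: str, payload: dict,
               fail_before_commit: bool = False) -> str:
    """Insert the business row and its outbox row in one transaction."""
    outbox_id = str(uuid.uuid4())
    # `with conn` commits if the body succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "INSERT INTO workflow_run(id, state, payload) VALUES (?, 'pending', ?)",
            (run_id, json.dumps(payload)),
        )
        if fail_before_commit:
            # Hypothetical switch: simulate a crash between the two writes.
            raise RuntimeError("simulated crash between the two writes")
        conn.execute(
            "INSERT INTO outbox(id, aggregate_id, event_type, payload) "
            "VALUES (?, ?, 'run.created', ?)",
            (outbox_id, run_id, json.dumps({"run_id": run_id})),
        )
    return outbox_id


def count(conn: sqlite3.Connection, table: str) -> int:
    """Row count helper for the demo; `table` must be a trusted name."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

When `create_run` raises partway through, the rollback removes the `workflow_run` row as well, so state and publish intent never disagree.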

Relay worker loop (Python)

relay.py
Python
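# select_unprocessed_outbox, publish, mark_processed, and now are placeholders for your storage and bus clients.
for row in select_unprocessed_outbox(batch=100):  # should return rows oldest first
    # Publish before marking: a crash in between causes a duplicate, never a lost event.
    publish(row.event_type, row.payload, message_id=row.id)
    mark_processed(row.id, processed_at=now())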
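
The loop above leaves ordering and failure handling to its helpers. A fuller sketch, again using `sqlite3`, might look like the following. The `seq` ordering column, the `open_outbox` and `relay_once` names, and the `publish` callable are assumptions for illustration; a real relay would call an actual bus client and probably add backoff and a dead-letter path.

```python
import sqlite3
from typing import Callable


def open_outbox() -> sqlite3.Connection:
    """In-memory outbox with an assumed monotonically increasing seq column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE outbox(
            seq INTEGER PRIMARY KEY AUTOINCREMENT,
            id TEXT UNIQUE NOT NULL,
            event_type TEXT NOT NULL,
            payload TEXT NOT NULL,
            publish_attempts INTEGER NOT NULL DEFAULT 0,
            processed_at TEXT
        )
        """
    )
    return conn


def relay_once(conn: sqlite3.Connection, publish: Callable[..., None],
               batch: int = 100) -> int:
    """Publish unprocessed rows in seq order; return how many were published."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox "
        "WHERE processed_at IS NULL ORDER BY seq LIMIT ?",
        (batch,),
    ).fetchall()
    published = 0
    for outbox_id, event_type, payload in rows:
        try:
            # The outbox id doubles as the message id consumers deduplicate on.
            publish(event_type, payload, message_id=outbox_id)
        except Exception:
            with conn:
                conn.execute(
                    "UPDATE outbox SET publish_attempts = publish_attempts + 1 "
                    "WHERE id = ?",
                    (outbox_id,),
                )
            break  # stop here so later rows cannot overtake the failed one
        with conn:
            conn.execute(
                "UPDATE outbox SET processed_at = CURRENT_TIMESTAMP, "
                "publish_attempts = publish_attempts + 1 WHERE id = ?",
                (outbox_id,),
            )
        published += 1
    return published
```

Stopping at the first failure keeps publish order stable at the cost of head-of-line blocking: one row that never publishes holds back everything behind it, which is why stuck-row alerts and a DLQ path matter.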

Outbox event record (JSON)

outbox-event.json
JSON
{
  "outbox_id": "obx_73a",
  "aggregate_id": "run_510",
  "event_type": "run.created",
  "published": true,
  "publish_attempts": 2,
  "processed_at": "2026-03-31T19:40:33Z"
}

Limitations and tradeoffs

  • Outbox adds write amplification and requires relay infrastructure.
  • Polling relays are simpler but can increase publish latency under load.
  • CDC relays reduce polling overhead but add operational complexity.
  • Outbox does not guarantee exactly-once consume; downstream idempotency stays mandatory.
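
Because delivery is at-least-once, the last point deserves a concrete shape. One common approach, sketched here with `sqlite3`, is to record each message ID in the same transaction as the side effect it triggers. The `processed_message` and `run_projection` tables and the `handle_run_created` handler are hypothetical names chosen for this example.

```python
import sqlite3


def open_consumer_db() -> sqlite3.Connection:
    """In-memory consumer store: a dedup table plus a toy side-effect table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE processed_message(message_id TEXT PRIMARY KEY);
        CREATE TABLE run_projection(run_id TEXT PRIMARY KEY, starts INTEGER NOT NULL);
        """
    )
    return conn


def handle_run_created(conn: sqlite3.Connection, message_id: str, run_id: str) -> bool:
    """Apply the event at most once; return False when it is a duplicate."""
    # Dedup marker and side effect share one transaction, so they commit together.
    with conn:
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed_message(message_id) VALUES (?)",
            (message_id,),
        )
        if cur.rowcount == 0:
            return False  # message id already recorded: skip the side effect
        conn.execute(
            "INSERT OR IGNORE INTO run_projection(run_id, starts) VALUES (?, 0)",
            (run_id,),
        )
        conn.execute(
            "UPDATE run_projection SET starts = starts + 1 WHERE run_id = ?",
            (run_id,),
        )
    return True
```

Delivering the same message twice leaves `starts` at 1. This only covers side effects that live in the consumer's own database; external calls still need their own idempotency keys.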

Next step

Run this rollout in one sprint:

  1. Identify top three dual-write paths in your agent workflows.
  2. Add outbox rows in the same transaction as state updates.
  3. Deploy a relay with publish retry + processed marker updates.
  4. Add idempotency checks to all consumers of relayed events.

Continue with AI Agent Idempotency Keys and AI Agent DLQ and Replay Patterns.

One transaction, one truth

If state and event disagree, recovery cost compounds across every downstream automation.