## The production problem
Approval systems live at the boundary of humans and unreliable networks.
Humans double-submit. Browsers retry. Proxies retry. Your endpoint gets called again after the decision already happened.
If your handler only returns conflicts, clients cannot tell duplicate success from real failure.
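When the handler does replay a success, a client can branch on both the HTTP status and the body's `status` field to tell the three cases apart. A minimal sketch of that client-side classification; the function and outcome names are illustrative, not part of the Cordum codebase:

```go
package main

import "fmt"

// Outcome is what a retry-aware client derives from an approve-endpoint response.
type Outcome string

const (
	OutcomeApproved Outcome = "approved" // first successful approval
	OutcomeReplay   Outcome = "replay"   // duplicate of an earlier success
	OutcomeConflict Outcome = "conflict" // genuine state mismatch
	OutcomeError    Outcome = "error"    // transport or server failure
)

// classifyApproveResponse maps (HTTP status, body "status" field) to an outcome.
// With a conflict-only handler, the replay case collapses into conflict and
// retries become ambiguous for the client.
func classifyApproveResponse(httpStatus int, bodyStatus string) Outcome {
	switch {
	case httpStatus == 200 && bodyStatus == "already_approved":
		return OutcomeReplay
	case httpStatus == 200:
		return OutcomeApproved
	case httpStatus == 409:
		return OutcomeConflict
	default:
		return OutcomeError
	}
}

func main() {
	fmt.Println(classifyApproveResponse(200, "approved"))         // approved
	fmt.Println(classifyApproveResponse(200, "already_approved")) // replay
	fmt.Println(classifyApproveResponse(409, ""))                 // conflict
}
```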
## What top results cover and miss
| Source | Strong coverage | Missing piece |
|---|---|---|
| AWS Builders' Library: Making retries safe with idempotent APIs | Client request identity and server-side dedup contracts for safe retries. | No human approval endpoint pattern where retries must distinguish already-approved vs still-pending states. |
| Stripe API docs: Idempotent requests | Idempotency key behavior for API retries and deterministic response replay expectations. | No treatment of approval queues where actor state and workflow state can diverge during retries. |
| PayPal docs: Idempotency | Retry-safe request replay and duplicate request handling with idempotency headers. | No dual-endpoint approve/reject flow where conflict and replay status need separate semantics. |
## Cordum runtime mechanics
| Boundary | Current behavior | Why it matters |
|---|---|---|
| Approve replay path | If job state moved beyond APPROVAL and request labels contain `approval_granted=true`, handler returns `200 already_approved`. | Safe client retries do not create approval duplicates or noisy conflict errors. |
| Reject replay path | If state is DENIED, reject handler returns `200 already_rejected`. | Retrying a successful reject remains deterministic for operators and bots. |
| Message dedup key | Approve path sets `req.Labels[cordum.bus_msg_id] = approval:<job_id>` before republishing. | NATS dedup can collapse repeated publish attempts for the same approved job. |
| Conflict scope | If state and labels do not match replay conditions, handler still returns conflict (`job not awaiting approval`). | Idempotency does not mask true state mismatches. |
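Concretely, a duplicate approve call against an already-approved job gets a body shaped like this (field names taken from the handler's response map; the values are illustrative):

```json
{
  "job_id": "job-7f3a",
  "status": "already_approved",
  "approved_by": "alice@example.com",
  "approved_at": "2025-01-07T14:02:11Z"
}
```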
## Idempotency paths in code

### Approve replay branch
```go
// core/controlplane/gateway/handlers_approvals.go (excerpt)
if state != model.JobStateApproval {
	if state == model.JobStatePending || state == model.JobStateSucceeded ||
		state == model.JobStateScheduled || state == model.JobStateDispatched ||
		state == model.JobStateRunning {
		req, _ := s.jobStore.GetJobRequest(ctx, jobID)
		if req != nil && req.Labels != nil && req.Labels["approval_granted"] == "true" {
			rec, _ := s.jobStore.GetApprovalRecord(ctx, jobID)
			result = handlerResult{http.StatusOK, map[string]any{
				"job_id":      jobID,
				"status":      "already_approved",
				"approved_by": rec.ApprovedBy,
				"approved_at": rec.ApprovedAt,
			}}
			return nil
		}
	}
	result = handlerResult{http.StatusConflict, "job not awaiting approval"}
	return nil
}
```

### Reject replay branch
```go
// core/controlplane/gateway/handlers_approvals.go (excerpt)
if state != model.JobStateApproval {
	if state == model.JobStateDenied {
		rec, _ := s.jobStore.GetApprovalRecord(ctx, jobID)
		result = handlerResult{http.StatusOK, map[string]any{
			"job_id":      jobID,
			"status":      "already_rejected",
			"rejected_by": rec.ApprovedBy,
			"rejected_at": rec.ApprovedAt,
		}}
		return nil
	}
	result = handlerResult{http.StatusConflict, "job not awaiting approval"}
	return nil
}
```

### Dedup key for publish retries
```go
// core/controlplane/gateway/handlers_approvals.go (excerpt)
// Stable idempotency key per job so NATS dedup works on retries.
req.Labels[bus.LabelBusMsgID] = "approval:" + jobID
if err := s.jobStore.SetJobRequest(ctx, req); err != nil {
	if strings.Contains(err.Error(), "transaction failed") {
		result = handlerResult{http.StatusConflict, "concurrent approval conflict; retry"}
		return nil
	}
}
```

### Existing idempotency tests
```go
// core/controlplane/gateway/handlers_approvals_test.go (excerpt)
func TestApproveJobIdempotent(t *testing.T) {
	// first approval returns 200
	// second approval returns 200 with status=already_approved
}

func TestRejectJobIdempotent(t *testing.T) {
	// first rejection returns 200
	// second rejection returns 200 with status=already_rejected
}
```

## Validation runbook
Validate this on staging before changing approval-client retry behavior.
```sh
# 1) Create approval-required job_id J
# 2) POST /api/v1/approvals/J/approve (expect 200)
# 3) Retry same approve call 5 times in parallel
# 4) Verify all retries return 200 and include status=already_approved
# 5) Repeat flow with reject path (expect already_rejected)
# 6) Inspect bus dedup label in stored request: cordum.bus_msg_id=approval:J
```
## Limitations and tradeoffs
| Approach | Upside | Downside |
|---|---|---|
| Always return 409 after first approval | Simple state machine exposure to clients. | Retries become noisy and require extra client-side interpretation. |
| Idempotent replay responses (current) | Deterministic outcomes for safe retries and better operator UX. | Requires stricter replay condition checks to avoid false positives. |
| Replay everything without state checks | Lowest client complexity. | Can hide real conflicts and weaken audit confidence. |
- Replay logic depends on specific state and label conditions, so custom integrations must preserve label integrity.
- Replay semantics do not replace conflict handling for genuine concurrent state transitions.
- I found idempotency tests for the success replay paths, but not exhaustive tests for every conflict branch under high concurrency.
## Next steps

Implement these next:
1. Document replay contracts explicitly in API docs (`already_approved`, `already_rejected`).
2. Add machine-readable error/replay codes for SDK-level retry routing.
3. Add concurrency tests that mix duplicate retries with true state conflicts.
4. Track replay rate vs conflict rate per endpoint to catch client retry regressions.
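The machine-readable codes in step 2 could start from a small shared enum that SDKs switch on instead of parsing message strings. The codes and routing below are a hypothetical contract, not something that exists in Cordum today:

```go
package main

import "fmt"

// ReplayCode is a machine-readable code an SDK routes on without parsing
// human-oriented message strings. (Hypothetical contract.)
type ReplayCode string

const (
	CodeAlreadyApproved ReplayCode = "already_approved"
	CodeAlreadyRejected ReplayCode = "already_rejected"
	CodeNotAwaiting     ReplayCode = "not_awaiting_approval"
	CodeConcurrentRetry ReplayCode = "concurrent_conflict_retry"
)

// Retryable reports whether an SDK should transparently retry the request.
func Retryable(c ReplayCode) bool {
	switch c {
	case CodeConcurrentRetry:
		return true // transient lock/transaction conflict; safe to retry
	case CodeAlreadyApproved, CodeAlreadyRejected:
		return false // terminal success replay; surface as success
	default:
		return false // genuine conflict; surface to the caller
	}
}

func main() {
	fmt.Println(Retryable(CodeConcurrentRetry))  // true
	fmt.Println(Retryable(CodeAlreadyApproved))  // false
}
```

Keeping the replay codes distinct from conflict codes is what lets step 4's replay-rate vs conflict-rate metrics stay honest.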
Continue with AI Agent NATS Msg-Id Strategy and AI Agent Approval Lock Contention.