Deep Dive

AI Agent Policy Decision Cache Invalidation

Fast policy checks are useful. Stale policy checks are expensive.

Deep Dive · 11 min read · Apr 2026
TL;DR
  • TTL-only caching is not enough for policy decisions that can change between deploys.
  • Cordum uses snapshot-prefixed request hashes and a policyVersion guard to avoid stale hits.
  • Decision cache entries are process-local, so policy update fanout across replicas needs explicit drift checks.
  • Cached responses strip `approval_ref`, rebind it to the current `job_id` on hit, and bypass the cache when velocity rules exist.
Snapshot keying

Cache keys include the policy snapshot ID, so reloads naturally create misses.

Version guard

Each entry is tagged with policyVersion and dropped if versions diverge.

Safe rebinding

Approval references are rebuilt per request, not persisted inside cache entries.

Scope

This guide focuses on input-policy decision caching inside a Safety Kernel, where stale cache hits can directly change autonomous dispatch behavior.

The production problem

Policy engines get cache layers for speed. Then policy rules change. If invalidation is loose, old decisions keep leaking into new traffic.

For autonomous agents, this is not a cosmetic bug. A stale `allow` can skip an approval gate. A stale `deny` can block revenue traffic. Both are policy incidents.

You need invalidation that is explicit, testable, and resilient to concurrent reload races.

What top results miss

| Source | Strong coverage | Missing piece |
| --- | --- | --- |
| Azure Cache-Aside pattern | Cache staleness risks, invalidation order, and expiration strategy tradeoffs. | No policy-engine semantics like approval references, snapshot lineage, or replica convergence checks. |
| Redis client-side caching | Server-assisted tracking and invalidation messages when tracked keys change. | No treatment of policy reload boundaries, approval identity rebinding, or fail-safe bypass for velocity rules. |
| Google Media CDN cache invalidation | Operational invalidation scope, latency, and origin load impact during purge. | No strategy for correctness-critical policy decisions where each kernel replica holds its own cache map. |

The gap is governance correctness: policy caches must preserve identity-sensitive fields and invalidate on rule lineage changes, not only on wall-clock expiry.

They also skip a practical operations question: how do you prove every replica converged to the same policy snapshot before high-risk traffic ramps?

Invalidation model

| Strategy | Strength | Risk |
| --- | --- | --- |
| TTL-only expiry | Simple and low implementation cost | Serves stale policy after a rule change until TTL expires |
| Explicit purge on policy update | Immediate invalidation after reload | Requires reliable fanout to all replicas |
| Snapshot-prefixed key | Automatic miss when the snapshot changes | Old entries remain until eviction unless purged |
| Version guard in entry | Protects against races and partial invalidation | Slight lookup overhead per cache hit attempt |
| Sensitive field strip/rebind | Prevents cross-request identity leakage | Requires precise request rehydration logic |
| Replica snapshot parity checks | Detects lagging replicas before stale decisions hit production traffic | Adds operational overhead and alert noise if thresholds are too strict |

Cordum runtime behavior

| Boundary | Current behavior | Operational impact |
| --- | --- | --- |
| Cache controls | `SAFETY_DECISION_CACHE_TTL` and `SAFETY_DECISION_CACHE_MAX_SIZE` (default max: 10000). | Controls freshness and memory bounds of the decision cache. |
| Cache key design | Key is `<snapshot>:<sha256(deterministic_request)>` with `job_id` cleared before hashing. | Reuse across equivalent requests while binding to policy snapshot lineage. |
| Approval safety | Cached response stores an empty `approval_ref`; on hit, the kernel rebinds the approval reference to the current job. | Prevents stale approval handles from leaking across jobs. |
| Version guard | Each entry stores `policyVersion`; a mismatched version causes delete+miss on lookup. | Protects against stale entries during concurrent reload windows. |
| Policy update path | `setPolicy()` increments `policyVersion`, updates snapshot history, then clears the cache immediately. | Hard invalidation on policy change, plus key-space miss from the snapshot shift. |
| Cache residency | Decision cache is process-local in each Safety Kernel replica; Redis stores snapshots, not decision entries. | Low-latency lookups, but update fanout lag can create short-lived replica skew. |
| Capacity eviction | On max size, the kernel sweeps expired entries first, then evicts the entry closest to expiry. | Memory stays bounded; near-expiry hot keys can churn under high-cardinality traffic. |
| Velocity rules | Decision cache is bypassed when the active policy contains velocity checks. | Avoids incorrect reuse for rate-sensitive decisions. |
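The policy update path in the table can be sketched as a minimal Go fragment: bump the version, record snapshot lineage, then hard-clear the decision cache. The `kernel` struct and its field names are illustrative stand-ins under stated assumptions, not Cordum's actual types.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// kernel is a hypothetical stand-in for Safety Kernel state; real field
// names in Cordum may differ.
type kernel struct {
	mu            sync.Mutex
	policyVersion atomic.Int64
	snapshots     []string            // newest first
	cache         map[string]struct{} // decision cache keys (values elided)
}

// setPolicy mirrors the update path described above: increment the version,
// push the new snapshot onto the history, then clear the cache immediately.
func (k *kernel) setPolicy(snapshot string) int64 {
	k.mu.Lock()
	defer k.mu.Unlock()
	v := k.policyVersion.Add(1)
	k.snapshots = append([]string{snapshot}, k.snapshots...)
	k.cache = make(map[string]struct{}) // hard invalidation on policy change
	return v
}
```

The version bump and the cache clear happen under one lock so a concurrent lookup can never observe the new version with old entries still resident.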

Worst-case stale window budget

stale_window_budget.txt
Text
max_stale_window <= policy_reload_interval + update_fanout_delay + in_flight_request_time

# Example with defaults:
# 30s (reload interval) + 5s fanout + 1s in-flight ~= 36s worst-case window

Implementation examples

Snapshot-prefixed deterministic key

cache_key.go
Go
func cacheKeyForRequest(req *pb.PolicyCheckRequest, snapshot string) string {
  clone := proto.Clone(req).(*pb.PolicyCheckRequest)
  clone.JobId = "" // enable reuse across equivalent jobs

  // Deterministic marshaling keeps field ordering stable across processes,
  // so equivalent requests always hash to the same key.
  data, err := proto.MarshalOptions{Deterministic: true}.Marshal(clone)
  if err != nil {
    return "" // treat as uncacheable; caller falls through to a full policy check
  }
  sum := sha256.Sum256(data)
  return snapshot + ":" + hex.EncodeToString(sum[:])
}
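The two properties that make this key safe, insensitivity to `job_id` and sensitivity to the snapshot, can be demonstrated with a simplified stand-in. The `request` struct below is hypothetical and exists only so the keying scheme is testable without the generated proto types.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// request is a simplified stand-in for pb.PolicyCheckRequest.
type request struct {
	JobID  string
	Action string
}

// keyFor mirrors the snapshot-prefixed scheme: clear JobID, hash the
// remaining fields, and prefix with the snapshot ID.
func keyFor(req request, snapshot string) string {
	req.JobID = "" // equivalent jobs share a key
	sum := sha256.Sum256([]byte(req.Action))
	return snapshot + ":" + hex.EncodeToString(sum[:])
}
```

Two requests that differ only in `job_id` produce the same key; the same request under a new snapshot produces a guaranteed miss.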

Version guard at read time

cache_guard.go
Go
// getCachedDecision must be called with the cache mutex held; stale entries
// are dropped eagerly on version mismatch or expiry.
func (s *server) getCachedDecision(key string) *pb.PolicyCheckResponse {
  currentVersion := s.policyVersion.Load()
  entry, ok := s.cache[key]
  if !ok {
    return nil
  }
  if entry.policyVersion != currentVersion {
    delete(s.cache, key) // stale lineage: a reload happened after this entry was stored
    return nil
  }
  if time.Now().After(entry.expires) {
    delete(s.cache, key) // TTL expiry
    return nil
  }
  // Return a clone so callers can rebind approval_ref without mutating the cached copy.
  return clonePolicyResponse(entry.resp)
}

Ops runbook checks

decision_cache_runbook.sh
Bash
# 1) Verify cache settings
kubectl exec -n cordum deploy/cordum-safety-kernel -- printenv SAFETY_DECISION_CACHE_TTL
kubectl exec -n cordum deploy/cordum-safety-kernel -- printenv SAFETY_DECISION_CACHE_MAX_SIZE

# 2) Roll policy update and confirm invalidation log
kubectl logs deploy/cordum-safety-kernel -n cordum | grep -E "policy updated, cache invalidated|policy snapshot updated"

# 3) Check snapshot history consistency (if grpcurl is available)
grpcurl -plaintext localhost:50051 cordum.protocol.pb.v1.SafetyKernel/ListSnapshots

# 4) Run two equivalent checks with different job_ids; expect same decision snapshot
# but approval_ref bound to each current job_id

Replica snapshot skew probe

snapshot_skew_probe.sh
Bash
pods=$(kubectl get pods -n cordum -l app=cordum-safety-kernel -o name)

for pod in $pods; do
  head_snapshot=$(kubectl exec -n cordum "$pod" -- \
    grpcurl -plaintext 127.0.0.1:50051 \
    cordum.protocol.pb.v1.SafetyKernel/ListSnapshots \
    | jq -r '.snapshots[0] // "none"')
  echo "$pod $head_snapshot"
done | sort -k2

# Gate deploy rollout if the distinct head-snapshot count is > 1

Limitations and tradeoffs

  • More invalidation safeguards add CPU and lock contention on cache paths.
  • Snapshot-prefixed keys increase churn and can lower the hit ratio after frequent policy updates.
  • Bypassing the cache for velocity rules protects correctness but increases latency for those requests.
  • An oversized TTL reduces load but raises the stale-decision blast radius if update fanout lags on one replica.
  • Local decision caches avoid distributed lock cost, but you must alert on cross-replica snapshot drift.

A policy cache that cannot prove freshness is a risk multiplier, not a performance feature.

Next step

Ship this hardening checklist in your next sprint:

  1. Set an explicit cache TTL and max size in the production environment.
  2. Add a policy reload integration test that asserts cache invalidation and version increment.
  3. Verify approval-required decisions rebind `approval_ref` per request after a cache hit.
  4. Add a dashboard panel for policy snapshot changes, cache hit/miss trends, and velocity bypass counts.
  5. Block rollout if replicas report different head snapshots for longer than your stale-window budget.

Continue with LLM Safety Kernel and AI Agent Safety Check Timeout Tuning.

Fast and fresh or fast and wrong

Decision-cache speed only matters if policy freshness is provable under reload and failover.