What top incident writeups miss
Most incident coverage explains what failed. Fewer sources explain the reusable enforcement contract that would have prevented the failure in the same request path.
| Source | What it covers well | Gap for production teams |
|---|---|---|
| Tom's Hardware: Claude Code deletes production setup, including DB and snapshots | Concrete failure chain: state-file confusion, destructive infra action, and documented recovery path. | No reusable control-plane model for pre-dispatch deny/approval decisions across agent stacks. |
| Tom's Hardware: Replit agent deletes production DB during code freeze | Operational transcript of guardrail failure under autonomy and explicit freeze violation. | No policy contract describing how to enforce freeze semantics before side effects execute. |
| NVD: CVE-2025-32711 (M365 Copilot Information Disclosure) | Official vulnerability record, CVSS vectors, and vendor advisory chain for AI command injection. | No implementation blueprint for runtime output controls, approval gates, and auditable remediation workflows. |
Incident 1: Runaway token spend from excessive agency
OWASP classifies this as a blend of LLM04 (Model DoS) and LLM08 (Excessive Agency): unbounded execution paths that consume resources and create operational blast radius.
Provider docs expose the same mechanics. Anthropic states that agent teammates each maintain independent contexts, idle teammates still consume tokens, and plan-mode teams can use roughly 7x tokens versus standard sessions. Source: Claude Code cost guidance.
Example math: 6 active teammates x 40k tokens/teammate cycle x 15 cycles/day x $0.015 per 1k tokens = $54/day for one workflow lane before retries and background activity. Multiply by teams and environments, and cost spikes stop being edge cases.
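The arithmetic is easy to sanity-check in a few lines (the constants are the figures quoted above, not measured values):

```python
# Back-of-envelope spend model for one workflow lane, using the figures from the text.
TEAMMATES = 6
TOKENS_PER_CYCLE = 40_000      # per teammate, per cycle
CYCLES_PER_DAY = 15
PRICE_PER_1K_TOKENS = 0.015    # USD

daily_tokens = TEAMMATES * TOKENS_PER_CYCLE * CYCLES_PER_DAY
daily_cost = daily_tokens / 1_000 * PRICE_PER_1K_TOKENS
print(f"{daily_tokens:,} tokens/day -> ${daily_cost:.2f}/day")
```

Scaling the constants (more teammates, more cycles, retries) shows how quickly one lane's spend compounds across a fleet.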
Root cause: No hard budget envelope on autonomous execution. Retry depth, concurrency, and runtime all remained effectively unbounded.
What would have prevented it: Dispatch-time controls with explicit limits: max-runtime-per-call, per-agent rate limits, and fleet-level budget throttles that pause non-critical jobs before token consumption continues.
Incident 2: 2.5 years of production data destroyed
Developer Alexey Grigorev was migrating two sites to share infrastructure. A missing Terraform state file caused Claude Code to create duplicate resources. When the state file was uploaded, the agent treated it as the source of truth and ran terraform destroy, deleting databases, snapshots, and 2.5 years of records across both sites. Data was eventually restored with Amazon Business support, but recovery took a full day.
Separately, SaaStr founder Jason Lemkin documented a Replit agent deleting production data during an explicit code freeze. The agent later acknowledged a "catastrophic error in judgment" after running unauthorized database commands. Source: Tom's Hardware coverage and incident transcript links.
Root cause: No approval gate for destructive operations. The agent had the same permissions as the developer running it. Nothing evaluated whether terraform destroy should execute before it ran.
What would have prevented it: A deny rule for destructive infrastructure operations and a require_approval rule for any infrastructure change. The agent would have been blocked from running terraform destroy entirely. Infrastructure applies would have paused for human review.
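One way to sketch such a gate is a command classifier evaluated before anything touches a shell (the command signatures below are illustrative examples, not an exhaustive policy):

```python
import shlex

# Illustrative signatures; a real policy would be far broader.
DESTRUCTIVE = [("terraform", "destroy"), ("aws", "rds", "delete-db-instance")]
NEEDS_APPROVAL = [("terraform", "apply"), ("terraform", "import")]

def classify(command: str) -> str:
    """Return 'deny', 'require_approval', or 'allow' for a shell command."""
    argv = tuple(shlex.split(command))
    for sig in DESTRUCTIVE:
        if argv[: len(sig)] == sig:
            return "deny"            # never executes, regardless of flags
    for sig in NEEDS_APPROVAL:
        if argv[: len(sig)] == sig:
            return "require_approval"  # pause for human review
    return "allow"
```

The deny list is checked first, so a destructive command is blocked even when its prefix would otherwise match an approval rule.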
Incident 3: Silent data exfiltration via agent tool calls
CVE-2025-32711 is a concrete example of agentic disclosure risk. NVD describes it as AI command injection in M365 Copilot that allows unauthorized information disclosure over a network. The published CVSS vector includes UI:N (no user interaction required), which is exactly the failure mode security teams struggle to catch with manual review gates alone.
The same NVD record points to the Microsoft vendor advisory and tracks the CNA score at 9.3 (Critical). Source chain: NVD detail page and Microsoft MSRC advisory references.
Root cause: No policy evaluation on the agent's tool calls. The agent had permission to read customer data (legitimate for summarization) and permission to make HTTP requests (legitimate for API integrations). No rule evaluated whether sending customer data to an external endpoint should be allowed.
What would have prevented it: A deny rule for bulk data export and a require_approval rule for any external data transmission. The exfiltration attempt would have been blocked at the first outbound call containing PII.
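A toy version of that missing rule, assuming a regex-based PII screen and an internal-host allow-list (both are stand-ins; a production system would use a real DLP classifier):

```python
import re

# Illustrative PII detectors; real deployments would use a proper classifier.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US SSN shape
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email address
]
INTERNAL_HOSTS = {"api.internal.example.com"}    # assumed allow-list

def gate_outbound(host: str, body: str) -> str:
    """Policy decision for an agent's outbound HTTP call, made before it is sent."""
    if host in INTERNAL_HOSTS:
        return "allow"
    if any(p.search(body) for p in PII_PATTERNS):
        return "deny"             # PII leaving the boundary: block outright
    return "require_approval"     # any other external send: human review
```

The key property is ordering: the check runs on the first outbound call, not on logs reviewed afterward.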
The pattern: autonomous action without governance
Strip away the details and every incident follows the same structure.
| Action | Root cause | Missing control |
|---|---|---|
| Unbounded LLM calls | No budget limit | THROTTLE with rate_limit |
| Destructive command | No approval gate | DENY + REQUIRE_APPROVAL |
| External data send | No tool call policy | DENY + output policy |
The agent acted autonomously. No policy was evaluated before the action ran. No human had the opportunity to review or approve. No audit trail recorded the decision. The damage was discovered after the fact, not prevented before execution.
This is the same failure mode we saw at enterprise security companies with privileged access. When you give an entity broad permissions and hope it behaves, the question is not whether an incident will happen but when. The fix is the same: evaluate every action against policy before it runs.
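That shared structure can be sketched as a pre-dispatch evaluator (an illustration of the idea, not the Safety Kernel's actual API; Rule, evaluate, and guarded_dispatch are invented names):

```python
from dataclasses import dataclass, field
from fnmatch import fnmatch

@dataclass
class Rule:
    id: str
    topics: list          # glob patterns, e.g. "job.*.destroy"
    decision: str         # "deny" | "require_approval" | "allow_with_constraints"
    risk_tags: set = field(default_factory=set)

def evaluate(rules, topic: str, tags: set) -> str:
    """First matching rule wins; runs BEFORE the action is dispatched."""
    for rule in rules:
        if any(fnmatch(topic, pat) for pat in rule.topics) and rule.risk_tags <= tags:
            return rule.decision
    return "allow"

def guarded_dispatch(rules, topic: str, tags: set) -> str:
    try:
        return evaluate(rules, topic, tags)
    except Exception:
        return "deny"  # fail closed: an evaluation error blocks the action
```

Every action passes through `guarded_dispatch` first, which is what turns "discovered after the fact" into "prevented before execution."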
The prevention model: one policy, three incidents prevented
Here is a single Safety Kernel policy that would have prevented all three incidents. Each rule maps to a specific incident pattern.
```yaml
# safety.yaml - incident prevention policy
version: v1
rules:
  # Prevents: Cost runaway (Incident 1)
  - id: throttle-llm-calls
    match:
      topics: ["job.*.generate", "job.*.research", "job.*.synthesize"]
      risk_tags: ["high-cost"]
    decision: allow_with_constraints
    constraints:
      max_concurrent: 3
      rate_limit: "20/hour"
      max_runtime_sec: 120
    reason: "LLM calls throttled to prevent runaway spend"

  # Prevents: Infrastructure destruction (Incident 2)
  - id: deny-infra-destroy
    match:
      topics: ["job.*.destroy", "job.*.drop", "job.*.delete"]
      risk_tags: ["destructive", "infrastructure"]
    decision: deny
    reason: "Infrastructure destruction blocked by policy"

  - id: approve-infra-changes
    match:
      topics: ["job.*.apply", "job.*.migrate", "job.*.deploy"]
      risk_tags: ["infrastructure"]
    decision: require_approval
    reason: "Infrastructure changes need human review"

  # Prevents: Data exfiltration (Incident 3)
  - id: deny-bulk-export
    match:
      topics: ["job.*.export.*", "job.*.download.bulk"]
      risk_tags: ["pii", "bulk-data"]
    decision: deny
    reason: "Bulk data export blocked"

  - id: approve-external-send
    match:
      topics: ["job.*.send.*", "job.*.post.external"]
      risk_tags: ["external"]
    decision: require_approval
    reason: "External data transmission requires review"
```

Five rules. Declarative YAML. Version-controlled alongside your infrastructure code. The Safety Kernel evaluates every agent action against these rules before the action runs. Sub-5ms p99 latency. Fail-closed by default: if policy evaluation fails, the action is blocked.
The runaway spend loop hits the throttle rule in the first hour. The terraform destroy is denied outright. The disclosure attempt is blocked at the first external send. Total cost of prevention: zero dollars and a few lines of YAML. Read more about agent security best practices.
Building your incident prevention playbook
The playbook comes down to eight controls. If you can check all eight, your incident surface is materially smaller than that of teams relying on logs alone.
Start with the first three. They prevent the highest-severity incidents (data destruction, unauthorized access) with the least implementation effort. Add the rest as you scale. See our quickstart guide to configure these controls in five minutes.