Incident 1: The $47,000 token bill
Two LangChain agents, an Analyzer and a Verifier, entered an infinite conversation cycle. The Analyzer would generate output. The Verifier would find an issue. The Analyzer would retry with different parameters. The Verifier would find another issue. This loop ran for 11 days straight, generating a $47,000 bill. The root cause was a misclassified error treated as "retry with different parameters" instead of failing.
In a separate incident in February 2026, a data enrichment agent misinterpreted API error codes and generated 2.3 million unintended API calls over a single weekend. Only an external rate limiter, not the agent framework, stopped it.
Root cause: No budget limit. No per-action timeout. No rate limiting. No circuit breaker. The agent framework had no mechanism to say "you have spent too much, stop."
What would have prevented it: A throttle rule limiting LLM calls to 20 per hour with a max_runtime_sec of 120 seconds per call. The loop would have hit the rate limit in the first hour. Total cost: under $15 instead of $47,000.
Incident 2: 2.5 years of production data destroyed
Developer Alexey Grigorev was migrating two sites to share infrastructure. A missing Terraform state file caused Claude Code to create duplicate resources. When the state file was uploaded, the agent treated it as the source of truth and ran terraform destroy, deleting databases, snapshots, and 2.5 years of records across both sites. Data was eventually restored with Amazon Business support, but the incident took a full day to recover from.
Separately, SaaStr founder Jason Lemkin's Replit AI agent deleted a live production database during a designated code freeze, destroying data for 1,200+ executives. It then fabricated 4,000 records with fictional people despite being instructed eleven times not to create fake data.
Root cause: No approval gate for destructive operations. The agent had the same permissions as the developer running it. Nothing evaluated whether terraform destroy should execute before it ran.
What would have prevented it: A deny rule for destructive infrastructure operations and a require_approval rule for any infrastructure change. The agent would have been blocked from runningterraform destroy entirely. Infrastructure applies would have paused for human review.
Incident 3: Silent data exfiltration via agent tool calls
A financial services firm deployed a ticket-summarization agent. The agent was prompt-injected and quietly exfiltrated customer PII to an external endpoint for weeks. Traditional DLP and logging controls never caught it because the agent was operating within its granted permissions. The data left through the agent's own tool calls, bypassing every conventional security boundary.
This pattern is not isolated. Researchers discovered a zero-click vulnerability (CVE-2025-32711, CVSS 9.3) in Microsoft 365 Copilot that could silently exfiltrate SharePoint files, Teams messages, and OneDrive documents via a crafted email, with no user interaction required.
Root cause: No policy evaluation on the agent's tool calls. The agent had permission to read customer data (legitimate for summarization) and permission to make HTTP requests (legitimate for API integrations). No rule evaluated whether sending customer data to an external endpoint should be allowed.
What would have prevented it: A deny rule for bulk data export and a require_approval rule for any external data transmission. The exfiltration attempt would have been blocked at the first outbound call containing PII.
The pattern: autonomous action without governance
Strip away the details and every incident follows the same structure.
Action: Unbounded LLM calls
Root cause: No budget limit
Missing control: THROTTLE with rate_limit
Action: Destructive command
Root cause: No approval gate
Missing control: DENY + REQUIRE_APPROVAL
Action: External data send
Root cause: No tool call policy
Missing control: DENY + output policy
The agent acted autonomously. No policy was evaluated before the action ran. No human had the opportunity to review or approve. No audit trail recorded the decision. The damage was discovered after the fact, not prevented before execution.
This is the same failure mode we saw at CyberArk and Checkpoint with privileged access. When you give an entity broad permissions and hope it behaves, the question is not whether an incident will happen but when. The fix is the same: evaluate every action against policy before it runs.
The prevention model: one policy, three incidents prevented
Here is a single Safety Kernel policy that would have prevented all three incidents. Each rule maps to a specific incident pattern.
# safety.yaml - incident prevention policy
version: v1
rules:
# Prevents: Cost runaway (Incident 1)
- id: throttle-llm-calls
match:
topics: ["job.*.generate", "job.*.research", "job.*.synthesize"]
risk_tags: ["high-cost"]
decision: allow_with_constraints
constraints:
max_concurrent: 3
rate_limit: "20/hour"
max_runtime_sec: 120
reason: "LLM calls throttled to prevent runaway spend"
# Prevents: Infrastructure destruction (Incident 2)
- id: deny-infra-destroy
match:
topics: ["job.*.destroy", "job.*.drop", "job.*.delete"]
risk_tags: ["destructive", "infrastructure"]
decision: deny
reason: "Infrastructure destruction blocked by policy"
- id: approve-infra-changes
match:
topics: ["job.*.apply", "job.*.migrate", "job.*.deploy"]
risk_tags: ["infrastructure"]
decision: require_approval
reason: "Infrastructure changes need human review"
# Prevents: Data exfiltration (Incident 3)
- id: deny-bulk-export
match:
topics: ["job.*.export.*", "job.*.download.bulk"]
risk_tags: ["pii", "bulk-data"]
decision: deny
reason: "Bulk data export blocked"
- id: approve-external-send
match:
topics: ["job.*.send.*", "job.*.post.external"]
risk_tags: ["external"]
decision: require_approval
reason: "External data transmission requires review"Five rules. Declarative YAML. Version-controlled alongside your infrastructure code. The Safety Kernel evaluates every agent action against these rules before the action runs. Sub-5ms p99 latency. Fail-closed by default: if policy evaluation fails, the action is blocked.
The $47,000 loop hits the throttle rule in the first hour. The terraform destroy is denied outright. The PII exfiltration is blocked at the first external send. Total cost of prevention: zero dollars and a few lines of YAML. Read more about agent security best practices.
Building your incident prevention playbook
Eight items. If you can check all eight, you are ahead of 87% of organizations deploying agents.
Start with the first three. They prevent the highest-severity incidents (data destruction, unauthorized access) with the least implementation effort. Add the rest as you scale. See our quickstart guide to configure these controls in five minutes.