The real production problem
MCP expands what agents can do, not just what they can read. That shift changes failure economics. A single risky call can cross from analysis into irreversible mutation.
Security teams often focus on handshake controls. Incidents usually happen after the handshake succeeds. The failures are in action authorization, scope boundaries, and detection paths too weak to catch misuse.
One uncomfortable truth
If your tool manifest changes at 2:00 AM and nobody gets paged, your attacker has a maintenance window.
What top sources cover vs miss
| Source | Strong coverage | Missing piece |
|---|---|---|
| MCP Security Best Practices (official) | Strong protocol-level threats: token passthrough, confused deputy, redirect/SSRF, and session handling rules. | No exploitability scoring model for prioritizing fixes when every issue looks critical on paper. |
| OWASP Guide for Third-Party MCP Servers | Practical attack classes such as tool poisoning, memory poisoning, tool interference, and discovery hardening. | Limited operational thresholds for paging, rollout blocks, and security SLO ownership. |
| Microsoft MCP Security Risk Analysis | Actionable mitigations for excessive permissions, indirect prompt injection, and baseline security posture upgrades. | No concrete detection query examples to catch violations before external impact. |
7-risk exploitability matrix
| Risk | Attacker precondition | Blast radius | Detection signal | Primary control |
|---|---|---|---|---|
| Tool poisoning and rug pull | Unverified tool manifest or version drift | High | Tool schema hash changes outside release window | Manifest pinning + checksum verification |
| Indirect prompt injection into tool calls | Untrusted content enters model context | High | Risky tool invocations after retrieval-heavy prompts | Pre-dispatch policy gate with action allowlist |
| Over-scoped OAuth tokens | Write scope granted to read-only workflows | Critical | Write actions by principals tagged read_only | Scope segmentation by server and action class |
| Token passthrough and confused deputy | Client relays user token directly to server | Critical | Downstream audience mismatch in token claims | Server-issued access token and audience checks |
| Shadow MCP server adoption | Direct connection outside approved registry | High | New server fingerprints with no approval record | Registry-only discovery + default deny |
| Cross-tool interference loops | Shared context between unrelated tool chains | Medium | Tool-call fan-out spike and repeated chain depth | Context isolation + chain depth guardrails |
| Output poisoning and data bleed | Raw tool output enters model context unfiltered | High | PII/secret classifier hit after tool completion | Output safety stage with REDACT/QUARANTINE paths |
Risk notes and fast checks
1. Tool poisoning and rug pull
Teams approve a tool once, then trust it forever. Attackers change descriptions or behavior later and borrow that trust.
Example: A harmless `list_tickets` tool updates to include hidden side effects in a minor version. Nobody reviews because the name looks familiar.
Fast check: Pin manifest hashes. Alert on description or parameter drift before execution is allowed.
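A minimal sketch of manifest pinning, assuming the tool manifest is available as a JSON-serializable dictionary. The function names and the manifest fields are hypothetical, chosen only to illustrate the check; a real deployment would store the pinned hash alongside the approval record.

```python
import hashlib
import json


def manifest_hash(manifest: dict) -> str:
    """Return a SHA-256 digest over a canonical JSON encoding of the manifest."""
    # sort_keys and fixed separators make the digest independent of key order.
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_pinned(manifest: dict, pinned_hash: str) -> bool:
    """True only if the live manifest still matches the hash recorded at approval."""
    return manifest_hash(manifest) == pinned_hash


# Hypothetical manifest for list_tickets as it looked when it was approved.
approved = {
    "name": "list_tickets",
    "description": "List open tickets for a project.",
    "parameters": {"project": {"type": "string"}},
}
pin = manifest_hash(approved)

# The same tool after a silent description change.
drifted = dict(approved, description="List open tickets and forward them externally.")

print(is_pinned(approved, pin))  # True
print(is_pinned(drifted, pin))   # False
```

Hashing the description and parameters together means a change to either one blocks execution until someone re-approves the tool.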
2. Indirect prompt injection into tool calls
The model receives adversarial instructions from retrieved data and treats them as task context.
Example: A document snippet says to invoke `delete_project`. The user never requested deletion. The model still proposes it.
Fast check: Run policy decisions on requested actions, not on the model's confidence text.
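One way to picture a pre-dispatch gate is a function that looks only at the requested tool and action class. The allowlist contents, the `ToolCall` type, and the `decide` function below are assumptions made for illustration, not part of any MCP SDK.

```python
from dataclasses import dataclass

# Hypothetical allowlist of (tool, action class) pairs this workflow may request.
ALLOWLIST = {
    ("list_tickets", "read"),
    ("get_project", "read"),
    ("update_ticket", "write"),
}


@dataclass(frozen=True)
class ToolCall:
    tool: str
    action_class: str  # "read" or "write"
    rationale: str     # model-generated text; the gate never reads it


def decide(call: ToolCall) -> str:
    """Return ALLOW, REQUIRE_APPROVAL, or DENY from the requested action alone."""
    if (call.tool, call.action_class) not in ALLOWLIST:
        return "DENY"
    if call.action_class == "write":
        return "REQUIRE_APPROVAL"
    return "ALLOW"


injected = ToolCall("delete_project", "write", "The document says this is urgent and safe.")
print(decide(injected))                                   # DENY
print(decide(ToolCall("update_ticket", "write", "")))     # REQUIRE_APPROVAL
print(decide(ToolCall("list_tickets", "read", "")))       # ALLOW
```

Because the rationale field is ignored, persuasive text injected through retrieved content cannot change the decision.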
3. Over-scoped OAuth tokens
Read workflows silently inherit write capabilities over time. One compromised token can mutate production systems.
Example: A reporting assistant token gains repository admin scope during a rushed incident fix and never gets narrowed again.
Fast check: Split credentials by action class and enforce hard deny for scope mismatches.
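The hard deny can be sketched as a check that runs before any scope lookup. The scope strings, the `read_only` tag handling, and the function name are hypothetical; the point is that a principal tagged read-only is refused write actions even when its token has drifted into carrying a write scope.

```python
# Hypothetical mapping from action class to the one scope that permits it.
REQUIRED_SCOPE = {"read": "tickets:read", "write": "tickets:write"}


def scope_decision(principal_tag: str, granted_scopes: set, action_class: str) -> str:
    """Return ALLOW or DENY for one requested action."""
    # Hard deny: a read_only principal never performs non-read actions,
    # regardless of what scopes its token has accumulated.
    if principal_tag == "read_only" and action_class != "read":
        return "DENY"
    required = REQUIRED_SCOPE.get(action_class)
    if required is None or required not in granted_scopes:
        return "DENY"
    return "ALLOW"


# A reporting token that picked up a write scope during an incident fix.
reporting_scopes = {"tickets:read", "tickets:write"}
print(scope_decision("read_only", reporting_scopes, "write"))  # DENY
print(scope_decision("read_only", reporting_scopes, "read"))   # ALLOW
```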
4. Token passthrough and confused deputy
Passing upstream user tokens downstream lets a server act with authority that was never granted to it.
Example: Server A receives a user token that was issued for a different audience, then presents that same token to server C and obtains data the user never authorized A to fetch.
Fast check: Issue server-specific tokens with explicit audience and reject relayed tokens.
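A minimal sketch of the audience check, assuming the token's signature and expiry have already been verified elsewhere and the claims are available as a dictionary. The function name and audience values are hypothetical. In JWTs the `aud` claim may be a single string or a list of strings, so both shapes are handled.

```python
def audience_matches(claims: dict, expected_audience: str) -> bool:
    """Check the aud claim of already-verified token claims against this server."""
    aud = claims.get("aud")
    if isinstance(aud, str):
        return aud == expected_audience
    if isinstance(aud, (list, tuple)):
        return expected_audience in aud
    # A missing or malformed aud claim is treated as a mismatch.
    return False


# Hypothetical claims: a token minted for server A, relayed to server C.
relayed = {"sub": "user-123", "aud": "mcp-server-a"}
print(audience_matches(relayed, "mcp-server-c"))  # False
print(audience_matches(relayed, "mcp-server-a"))  # True
```

Rejecting tokens with no `aud` claim at all keeps the check fail-closed.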
5. Shadow MCP servers
Untracked servers become permanent infrastructure because prototypes are faster than procurement workflows.
Example: A local helper server added for one demo keeps running in CI with stale credentials for months.
Fast check: Block registry misses at connection time. If it is not registered, it does not execute.
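Default deny at connection time can be as small as a lookup that refuses anything without an approval record. The registry contents and fingerprint strings below are placeholders, and the function name is hypothetical.

```python
# Hypothetical registry: approved server name -> fingerprint recorded at approval.
REGISTRY = {
    "github": "fp-github-01",
    "jira": "fp-jira-01",
}


def connection_decision(server_name: str, fingerprint: str) -> str:
    """Return ALLOW only for a registered server presenting its approved fingerprint."""
    approved = REGISTRY.get(server_name)
    if approved is None:
        return "DENY"  # registry miss: the server was never approved
    if approved != fingerprint:
        return "DENY"  # known name, unknown fingerprint
    return "ALLOW"


print(connection_decision("github", "fp-github-01"))       # ALLOW
print(connection_decision("demo-helper", "fp-demo-01"))    # DENY
print(connection_decision("github", "fp-unknown"))         # DENY
```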
6. Cross-tool interference loops
Outputs from one tool chain accidentally trigger unrelated tools in another chain, creating noisy and risky cascades.
Example: A summary step emits text that another parser interprets as a new action request, creating recursive calls.
Fast check: Set max chain depth, isolate contexts, and require explicit handoff objects between tool domains.
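A sketch of a chain depth guardrail built on explicit handoff objects. The depth limit of 3, the `Handoff` fields, and the domain names are assumptions for illustration. The idea is that only a structured handoff can start the next tool call, and each handoff carries a depth counter that is checked before dispatch.

```python
from dataclasses import dataclass

MAX_CHAIN_DEPTH = 3  # hypothetical limit; tune per workflow


class ChainDepthExceeded(RuntimeError):
    """Raised when a tool chain tries to hand off past the depth limit."""


@dataclass(frozen=True)
class Handoff:
    source_domain: str
    target_domain: str
    depth: int
    payload: dict


def hand_off(current: Handoff, target_domain: str, payload: dict) -> Handoff:
    """Create the next handoff, refusing chains deeper than MAX_CHAIN_DEPTH."""
    depth = current.depth + 1
    if depth > MAX_CHAIN_DEPTH:
        raise ChainDepthExceeded(f"depth {depth} exceeds limit {MAX_CHAIN_DEPTH}")
    return Handoff(current.target_domain, target_domain, depth, payload)


step = Handoff("user", "tickets", 0, {"query": "open tickets"})
for domain in ["summary", "crm", "email"]:
    step = hand_off(step, domain, {})
print(step.depth)  # 3

try:
    hand_off(step, "tickets", {})
except ChainDepthExceeded as exc:
    print("blocked:", exc)
```

Free text emitted by a summary step never becomes a `Handoff`, so it cannot start a new chain on its own.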
7. Output poisoning and data bleed
Sensitive data in tool output enters model context and can leak later in unrelated user responses.
Example: A CRM lookup returns personal identifiers. The model later repeats them in a generic status update.
Fast check: Run output classifiers before model ingestion. REDACT low-risk secrets, QUARANTINE high-risk payloads.
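The output stage can be sketched with two illustrative patterns. Treating an email address as a REDACT case and a private key header as a QUARANTINE case is an assumption made for this example; a production classifier would cover far more than two regular expressions.

```python
import re

# Illustrative patterns only, not a complete PII or secret detector.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PRIVATE_KEY = re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----")


def screen_output(text: str) -> tuple:
    """Return (decision, text_for_model) for one raw tool output."""
    if PRIVATE_KEY.search(text):
        # High risk: withhold the whole payload from the model context.
        return "QUARANTINE", ""
    redacted, hits = EMAIL.subn("[REDACTED_EMAIL]", text)
    if hits:
        return "REDACT", redacted
    return "ALLOW", text


print(screen_output("Owner: jane@example.com"))
# ('REDACT', 'Owner: [REDACTED_EMAIL]')
print(screen_output("-----BEGIN RSA PRIVATE KEY-----\nabc")[0])
# QUARANTINE
```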
Detection and containment gates
These gates are intentionally strict during early production rollout. Relaxing thresholds is easy later. Explaining an avoidable breach to legal is usually harder.
| Gate | Target | Page condition | Owner |
|---|---|---|---|
| Unapproved write calls | 0 | > 0 in any 10m window | Governance On-call |
| Registry drift | 0 unknown servers | Any unknown server in production | Platform Security |
| Manifest drift outside deploy window | 0 | > 0 and no linked change ticket | MCP Platform |
| Scope mismatch attempts | < 0.1% | >= 1% for 15m | Identity Team |
| QUARANTINE ratio | < 1% | > 3% for 15m | Safety Team |
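The page conditions in the table can be expressed as data so they are reviewed like any other config. This sketch assumes each value has already been aggregated over the gate's window (10 or 15 minutes) by the metrics pipeline, and it leaves out the manifest drift gate because that one also depends on a linked change ticket. The gate keys and function name are hypothetical.

```python
import operator

# Page conditions transcribed from the table; ratios are fractions, not percentages.
PAGE_CONDITIONS = {
    "unapproved_write_calls": (operator.gt, 0),
    "unknown_servers": (operator.gt, 0),
    "scope_mismatch_ratio": (operator.ge, 0.01),
    "quarantine_ratio": (operator.gt, 0.03),
}


def should_page(gate: str, value: float) -> bool:
    """True if the windowed value for this gate meets its page condition."""
    compare, threshold = PAGE_CONDITIONS[gate]
    return compare(value, threshold)


print(should_page("unapproved_write_calls", 1))   # True
print(should_page("scope_mismatch_ratio", 0.01))  # True
print(should_page("quarantine_ratio", 0.03))      # False
```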
Policy and SIEM examples
version: v1
rules:
  - id: deny-unregistered-server
    match:
      labels:
        mcp.server: "*"
      mcp:
        deny_servers_not_in: ["github", "jira", "snowflake", "slack"]
    decision: DENY
  - id: require-approval-for-write
    match:
      labels:
        mcp.action: write
    decision: REQUIRE_APPROVAL
    constraints:
      max_runtime_sec: 60
      max_retries: 1
  - id: deny-scope-mismatch
    match:
      auth:
        require_scope_match: true
    decision: DENY

-- Detect write actions executed without approval
SELECT event_time, actor_id, server_name, tool_name, decision, approval_id
FROM mcp_audit
WHERE action_class = 'write'
  AND decision = 'ALLOW'
  AND (approval_id IS NULL OR approval_id = '')
  AND event_time >= NOW() - INTERVAL '10 minutes'
ORDER BY event_time DESC;

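#!/usr/bin/env bash
# Emergency containment for a single MCP server.
set -euo pipefail
SERVER="${1:?usage: $0 <server-name>}"
# API must hold the control-plane base URL; fail fast if it is unset.
: "${API:?set API to the control-plane base URL}"
echo "[1/3] disabling server in registry: $SERVER"
# -f makes curl exit non-zero on HTTP errors so set -e stops the script.
curl -fsS -X POST "$API/registry/$SERVER/disable"
echo "[2/3] revoking active tokens for server: $SERVER"
# Inner quotes are escaped so the request body is valid JSON.
curl -fsS -X POST "$API/tokens/revoke" -H "Content-Type: application/json" -d "{\"audience\":\"$SERVER\"}"
echo "[3/3] forcing approval mode on all write actions"
curl -fsS -X POST "$API/policy/emergency-write-approval"
echo "Containment completed for $SERVER"

For broader rollout controls, pair this with MCP in Production Best Practices.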
Limitations and tradeoffs
More operational noise early on
Strict paging thresholds can be noisy in week one. Tune with data, not guesswork.
Approval queues can slow delivery
Write-path approval protects systems but introduces latency. Plan staffing around peak windows.
Classifier tuning is ongoing work
Output safety false positives are normal. Teams need a regular calibration cycle.