MCP Governance

How to Add Governance to Model Context Protocol Servers

You can ship MCP quickly. You cannot debug governance debt quickly after the first high-impact incident.

Governance · 15 min read · Updated Apr 2026
TL;DR
  • MCP transport security is necessary, but not enough to govern action risk.
  • The safest design point is pre-dispatch policy evaluation outside the model runtime.
  • Approval gates should target high-impact writes, not every call.
  • Output safety is a governance control, not a cosmetic post-processing step.
Threat Surface

MCP expands the set of callable actions. Governance must limit which of those actions can actually execute.

Pre-dispatch Control

A deterministic policy decision returns the same result for the same call. A prompt-level instruction can be overridden by adversarial or ambiguous input.

Post-dispatch Safety

Output safety catches sensitive payloads that passed legitimate read permissions.

Fast reality check

If you cannot answer "which policy allowed this write call" in under five minutes, your governance layer is incomplete.
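
One way to make that question answerable is to key audit evidence by call ID. The sketch below is a minimal in-memory illustration, not a production store: `AuditRecord`, `AuditIndex`, and `PolicyFor` are hypothetical names, and the outcome strings reuse the statuses from the gateway handler later in this article.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// AuditRecord is a hypothetical evidence record, written once per tool call.
type AuditRecord struct {
	CallID   string
	Server   string
	Action   string // "read" or "write"
	PolicyID string // id of the rule that produced the decision
	Outcome  string // "blocked", "pending_approval", "executed", ...
	At       time.Time
}

// AuditIndex maps call IDs to records so a lookup is a single map access.
type AuditIndex map[string]AuditRecord

// PolicyFor reports which policy let a call execute. It returns false for
// unknown calls and for calls that never ran (blocked or still pending).
func (idx AuditIndex) PolicyFor(callID string) (string, bool) {
	rec, ok := idx[callID]
	if !ok || !strings.HasPrefix(rec.Outcome, "executed") {
		return "", false
	}
	return rec.PolicyID, true
}

func main() {
	idx := AuditIndex{
		"call-42": {
			CallID:   "call-42",
			Server:   "github",
			Action:   "write",
			PolicyID: "constrain-medium-risk",
			Outcome:  "executed_with_constraints",
			At:       time.Now(),
		},
	}
	if id, ok := idx.PolicyFor("call-42"); ok {
		fmt.Println("call-42 was allowed by policy:", id)
	}
}
```

A real audit store would be append-only and durable. The point here is only that the policy ID is recorded at decision time instead of being reconstructed afterwards.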

The governance gap

Teams adopt MCP for speed. Governance usually arrives later. Attack paths do not wait for the second phase.

The most common mistake is treating MCP security as a transport checklist only. Transport controls matter, but they do not decide whether a specific tool action should execute right now.

Problem framing

MCP security asks "who can connect?" Governance asks "what can run, under which constraints, with which evidence?"

What top sources cover vs miss

| Source | Strong coverage | Missing piece |
| --- | --- | --- |
| MCP Security Best Practices (official) | Protocol-level hardening: token handling, redirect/SSRF protection, and authorization boundaries. | No full governance control-plane pattern for policy, approval, and post-call output handling. |
| MCP Authorization Tutorial (official) | OAuth-centered auth flow and implementation sequence for secure MCP access. | No risk-tier action model for deciding which calls should require human approval. |
| OWASP Third-Party MCP Server Guide | Threat classes like tool poisoning, tool interference, and server discovery controls. | Limited operational SLO guidance for approval queue latency, policy latency, and incident containment. |

Reference architecture

| Component | Responsibility | Failure mode if missing |
| --- | --- | --- |
| MCP Gateway | Normalizes tool calls into a policy envelope and enforces hard blocks. | Clients call servers directly, bypassing centralized governance. |
| Policy Engine (PDP) | Returns ALLOW, DENY, REQUIRE_APPROVAL, or ALLOW_WITH_CONSTRAINTS. | Safety decisions fall back to prompt instructions and drift quickly. |
| Approval Service | Queues high-risk requests and binds approvals to policy snapshots. | Write actions run without explicit human accountability. |
| Output Safety Stage | Scans tool output for secrets/PII before model ingestion. | Approved reads can still leak sensitive data in model responses. |
| Audit Store | Stores immutable decision and execution evidence per call. | Post-incident forensics become guesswork. |
| Server Registry | Maintains approved server identities, versions, and ownership. | Shadow server adoption quietly expands attack surface. |
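
To make the gateway's "policy envelope" concrete, here is a minimal sketch of the normalization step. The `ToolCall` and `PolicyEnvelope` shapes, the `Environment` field, and the read-prefix heuristic are all assumptions made for illustration. Only the label keys (`mcp.action`, `mcp.server`) and the risk tags (`prod`, `non_prod`) come from the policy file shown later in this article.

```go
package main

import (
	"fmt"
	"strings"
)

// ToolCall is a hypothetical raw invocation as the gateway receives it.
type ToolCall struct {
	Server      string
	Tool        string
	Environment string
}

// PolicyEnvelope is a hypothetical normalized form handed to the policy engine.
type PolicyEnvelope struct {
	Labels   map[string]string
	RiskTags []string
}

// readPrefixes lists tool-name prefixes treated as read-only (an assumption;
// a real gateway would more likely use per-tool metadata from the registry).
var readPrefixes = []string{"get_", "list_", "search_"}

// NormalizeToPolicyEnvelope classifies a call conservatively: anything not
// recognized as a read is a write, and any unknown environment is production.
func NormalizeToPolicyEnvelope(call ToolCall) PolicyEnvelope {
	tool := strings.ToLower(call.Tool)
	action := "write"
	for _, prefix := range readPrefixes {
		if strings.HasPrefix(tool, prefix) {
			action = "read"
			break
		}
	}

	risk := "prod"
	switch strings.ToLower(call.Environment) {
	case "dev", "staging":
		risk = "non_prod"
	}

	return PolicyEnvelope{
		Labels: map[string]string{
			"mcp.action": action,
			"mcp.server": strings.ToLower(call.Server),
		},
		RiskTags: []string{risk},
	}
}

func main() {
	env := NormalizeToPolicyEnvelope(ToolCall{
		Server:      "GitHub",
		Tool:        "merge_pull_request",
		Environment: "prod",
	})
	fmt.Println(env.Labels["mcp.server"], env.Labels["mcp.action"], env.RiskTags)
}
```

Defaulting to the riskier class means a newly added tool triggers stricter handling until someone classifies it, rather than slipping through as a read.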

Decision model for tool calls

| Decision | Use case | Operational rule |
| --- | --- | --- |
| ALLOW | Read-only calls to approved servers with valid scopes. | Execute immediately and log decision evidence. |
| DENY | Unknown servers, scope mismatch, policy violations. | Return deterministic block reason to caller. |
| REQUIRE_APPROVAL | Production writes, destructive actions, cross-tenant impact. | Hold request in approval queue with timeout behavior. |
| ALLOW_WITH_CONSTRAINTS | Medium-risk actions that are allowed with bounded runtime. | Enforce max runtime, retries, and network allowlist. |
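
The table above can be read as an ordered rule list. The sketch below encodes it as a pure function so the same input always yields the same decision and reason. The `Request` type and the hard-coded server set are illustrative assumptions, and scope validation is left out to keep the example short.

```go
package main

import "fmt"

// Decision mirrors the four outcomes in the decision table.
type Decision string

const (
	Allow                Decision = "ALLOW"
	Deny                 Decision = "DENY"
	RequireApproval      Decision = "REQUIRE_APPROVAL"
	AllowWithConstraints Decision = "ALLOW_WITH_CONSTRAINTS"
)

// Request is a hypothetical, already-normalized tool call.
type Request struct {
	Server  string
	Action  string // "read" or "write"
	RiskTag string // "prod" or "non_prod"
}

// approvedServers stands in for a registry lookup (an assumption).
var approvedServers = map[string]bool{
	"github": true, "jira": true, "snowflake": true, "slack": true,
}

// Decide evaluates rules top to bottom and returns the first match.
// Anything that matches no rule is denied.
func Decide(r Request) (Decision, string) {
	switch {
	case !approvedServers[r.Server]:
		return Deny, "server not in registry"
	case r.Action == "read":
		return Allow, "read on approved server"
	case r.Action == "write" && r.RiskTag == "prod":
		return RequireApproval, "production write"
	case r.Action == "write" && r.RiskTag == "non_prod":
		return AllowWithConstraints, "non-production write"
	default:
		return Deny, "no rule matched"
	}
}

func main() {
	requests := []Request{
		{"github", "read", "prod"},
		{"github", "write", "prod"},
		{"slack", "write", "non_prod"},
		{"unknown-server", "read", "non_prod"},
	}
	for _, r := range requests {
		decision, reason := Decide(r)
		fmt.Printf("%s %s %s -> %s (%s)\n", r.Server, r.Action, r.RiskTag, decision, reason)
	}
}
```

Returning the reason alongside the decision is what lets the gateway give callers a deterministic block reason, as the DENY row requires.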

Implementation examples

Policy should be reviewable like code. Gateway behavior should be deterministic like infrastructure.

mcp-governance-policy.yaml
YAML
version: v1
rules:
  # Listed first so that, under first-match evaluation, a call to an
  # unregistered server is denied before any allow rule can match it.
  - id: deny-unregistered-server
    match:
      labels:
        mcp.server: "*"
      mcp:
        deny_servers_not_in: ["github", "jira", "snowflake", "slack"]
    decision: DENY

  # Reads against approved servers run without human review.
  - id: allow-read-approved-servers
    match:
      labels:
        mcp.action: read
        mcp.server: ["github", "jira", "snowflake", "slack"]
    decision: ALLOW

  # Production writes wait for approval; constraints apply once approved.
  - id: require-approval-production-write
    match:
      labels:
        mcp.action: write
      risk_tags: ["prod"]
    decision: REQUIRE_APPROVAL
    constraints:
      max_runtime_sec: 60
      max_retries: 1

  # Non-production writes run immediately, but inside hard bounds.
  - id: constrain-medium-risk
    match:
      labels:
        mcp.action: write
      risk_tags: ["non_prod"]
    decision: ALLOW_WITH_CONSTRAINTS
    constraints:
      max_runtime_sec: 30
      network_allowlist: ["api.github.com", "api.slack.com"]
mcp-gateway-handler.go
Go
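// Pre-dispatch governance wrapper for MCP tool invocations.
// Only an explicit ALLOW or ALLOW_WITH_CONSTRAINTS decision leads to
// execution; errors and unrecognized decisions fail closed.
func HandleToolCall(call ToolCall) (DecisionResult, error) {
  envelope := NormalizeToPolicyEnvelope(call)

  decision, err := policyClient.Decide(envelope)
  if err != nil {
    // No policy decision means no execution. Record the failure as evidence.
    audit.Log(call, decision, "policy_error")
    return DecisionResult{}, err
  }

  switch decision.Type {
  case "DENY":
    audit.Log(call, decision, "blocked")
    return DecisionResult{Status: "blocked", Reason: decision.Reason}, nil
  case "REQUIRE_APPROVAL":
    // Hold the call in the queue; nothing executes on this path.
    reqID := approvals.Enqueue(call, decision)
    audit.Log(call, decision, "pending_approval")
    return DecisionResult{Status: "pending_approval", ApprovalID: reqID}, nil
  case "ALLOW_WITH_CONSTRAINTS":
    constrained := ApplyConstraints(call, decision.Constraints)
    output := Execute(constrained)
    // Filter before the output can reach the model.
    safe := outputSafety.Filter(output)
    audit.Log(call, decision, "executed_with_constraints")
    return DecisionResult{Status: "executed", Output: safe}, nil
  case "ALLOW":
    output := Execute(call)
    safe := outputSafety.Filter(output)
    audit.Log(call, decision, "executed")
    return DecisionResult{Status: "executed", Output: safe}, nil
  default:
    // Unrecognized decision type: block instead of executing.
    audit.Log(call, decision, "blocked")
    return DecisionResult{Status: "blocked", Reason: "unrecognized policy decision"}, nil
  }
}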
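
The handler treats `outputSafety.Filter` as a black box. As a rough idea of what such a stage might do, the sketch below redacts two kinds of sensitive values using regular expressions. The detector list, the placeholder format, and the extra return value (detector names) are assumptions for illustration. A real output safety stage would need far broader coverage and tuning.

```go
package main

import (
	"fmt"
	"regexp"
)

// detectors is an illustrative, deliberately small set of patterns.
var detectors = []struct {
	name string
	re   *regexp.Regexp
}{
	// AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
	{"aws_access_key", regexp.MustCompile(`AKIA[0-9A-Z]{16}`)},
	// A simple email shape; real PII detection needs more than this.
	{"email", regexp.MustCompile(`[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}`)},
}

// Filter replaces every match with a placeholder and returns the names of
// the detectors that fired, so the caller can log or quarantine the output.
func Filter(output string) (string, []string) {
	var hits []string
	for _, d := range detectors {
		if d.re.MatchString(output) {
			hits = append(hits, d.name)
			output = d.re.ReplaceAllString(output, "[REDACTED:"+d.name+"]")
		}
	}
	return output, hits
}

func main() {
	raw := "key=AKIAABCDEFGHIJKLMNOP owner=a@b.io"
	safe, hits := Filter(raw)
	fmt.Println(safe)
	fmt.Println("detectors fired:", hits)
}
```

Returning the detector names, not just the cleaned text, gives the audit store something to record and gives operators a signal to track over time.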

Rollout and ops gates

Governance maturity is measurable. If your controls cannot be measured, they are still in prototype mode.

| Gate | Target | Block condition | Owner |
| --- | --- | --- | --- |
| Policy decision p95 latency | <= 50 ms | > 150 ms for 15 min | Platform |
| Unapproved high-risk writes | 0 | > 0, immediate stop | Governance |
| Approval queue median wait | <= 10 min | > 20 min for 30 min | Ops Lead |
| Unknown server connections | 0 | > 0, immediate stop | Security |
| Output QUARANTINE ratio | < 1% | > 3% for 15 min | Safety Team |
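
As an illustration of how the first gate could be evaluated, the sketch below computes a nearest-rank p95 over latency samples and checks whether a per-minute series has stayed above a limit for a whole window. The function names, the `millis` helper, and the one-value-per-minute layout are assumptions. In practice these numbers would usually come from a metrics system rather than in-process slices.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// millis is a small helper that builds a duration slice from millisecond values.
func millis(vals ...int) []time.Duration {
	out := make([]time.Duration, len(vals))
	for i, v := range vals {
		out[i] = time.Duration(v) * time.Millisecond
	}
	return out
}

// P95 returns the nearest-rank 95th percentile, or 0 for an empty input.
func P95(samples []time.Duration) time.Duration {
	if len(samples) == 0 {
		return 0
	}
	sorted := append([]time.Duration(nil), samples...) // leave the input untouched
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	rank := (95*len(sorted) + 99) / 100 // ceil(0.95 * n) in integer math
	return sorted[rank-1]
}

// SustainedBreach reports whether each of the last `minutes` per-minute
// values exceeded the limit. Too little data never counts as a breach.
func SustainedBreach(perMinute []time.Duration, limit time.Duration, minutes int) bool {
	if minutes <= 0 || len(perMinute) < minutes {
		return false
	}
	for _, v := range perMinute[len(perMinute)-minutes:] {
		if v <= limit {
			return false
		}
	}
	return true
}

func main() {
	samples := millis(12, 18, 25, 31, 40, 44, 47, 52, 160, 38)
	fmt.Println("p95:", P95(samples))

	// Three minutes of per-minute p95 values, checked against a 150 ms limit.
	series := millis(155, 170, 162)
	fmt.Println("block:", SustainedBreach(series, 150*time.Millisecond, 3))
}
```

The window length is a parameter so the same check can serve gates with different block conditions, such as the 15-minute and 30-minute windows in the table.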

Limitations and tradeoffs

More upfront platform work

Centralized governance adds architecture complexity before it reduces incident complexity.

Approval latency in high-risk flows

Human review protects systems but can delay sensitive write operations.

Output safety tuning is continuous

Classification thresholds need iterative calibration to reduce noise and misses.

Frequently Asked Questions

Where should governance run in an MCP architecture?
Between the MCP client and server, before execution. If governance runs inside the same model loop, it is easier to bypass.
Should all write actions require approval?
Production-impact writes should. Lower-risk non-production writes can run with strict constraints and full audit logging.
Is OAuth enough for MCP governance?
OAuth handles identity and token scope. Governance also needs policy decisions, approval workflow, output safety, and audit evidence.
How do I avoid approval bottlenecks?
Use risk tiers and clear ownership. Let routine reads pass on ALLOW with no human in the loop, and reserve review for high-impact actions.
What is the first governance control to ship?
Start with registry-only server access and pre-dispatch policy enforcement. Those two controls close many high-risk gaps quickly.
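
As a starting point for registry-only access, the sketch below checks a server's name and version against a pinned entry before any call is dispatched. The `ServerEntry` fields follow the registry description in the reference architecture (identity, version, ownership). The exact-match version rule and all names are assumptions for illustration.

```go
package main

import "fmt"

// ServerEntry is a hypothetical registry record for one approved server.
type ServerEntry struct {
	Name    string
	Version string // pinned version; exact match is an assumption
	Owner   string // team accountable for this server
}

// Registry maps server names to their approved entries.
type Registry map[string]ServerEntry

// Check returns nil only when the server is registered and the version
// matches the pinned one. Every other case is an error the gateway can
// turn into a DENY with a concrete reason.
func (r Registry) Check(name, version string) error {
	entry, ok := r[name]
	if !ok {
		return fmt.Errorf("server %q is not in the registry", name)
	}
	if entry.Version != version {
		return fmt.Errorf("server %q version %q does not match pinned %q",
			name, version, entry.Version)
	}
	return nil
}

func main() {
	registry := Registry{
		"github": {Name: "github", Version: "1.4.0", Owner: "platform"},
	}
	for _, candidate := range [][2]string{
		{"github", "1.4.0"},
		{"github", "2.0.0"},
		{"shadow-server", "0.1.0"},
	} {
		if err := registry.Check(candidate[0], candidate[1]); err != nil {
			fmt.Println("deny:", err)
			continue
		}
		fmt.Println("ok:", candidate[0], candidate[1])
	}
}
```
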
Next step

Run one dry run this week: force a production write call to go through REQUIRE_APPROVAL and confirm policy snapshot binding, queue routing, and audit output all work end to end.
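
For the snapshot-binding part of that dry run, one possible approach is to identify a policy snapshot by the hash of its content and store that hash on the approval. The sketch below assumes SHA-256 over the raw policy bytes. The `Approval` type and `ValidFor` method are hypothetical, and your approval service may bind snapshots differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// SnapshotID derives a stable identifier from the policy content.
func SnapshotID(policy []byte) string {
	sum := sha256.Sum256(policy)
	return hex.EncodeToString(sum[:])
}

// Approval is a hypothetical record of a human decision on one call.
type Approval struct {
	CallID         string
	Approver       string
	PolicySnapshot string // SnapshotID of the policy the approver reviewed under
}

// ValidFor reports whether the approval covers this call under the policy
// that is active right now. A policy change after approval invalidates it.
func (a Approval) ValidFor(callID string, currentPolicy []byte) bool {
	return a.CallID == callID && a.PolicySnapshot == SnapshotID(currentPolicy)
}

func main() {
	policyV1 := []byte("version: v1\nrules: [require-approval-production-write]\n")
	policyV2 := []byte("version: v1\nrules: []\n")

	approval := Approval{
		CallID:         "call-42",
		Approver:       "ops-lead",
		PolicySnapshot: SnapshotID(policyV1),
	}

	fmt.Println("same policy:", approval.ValidFor("call-42", policyV1))
	fmt.Println("policy changed:", approval.ValidFor("call-42", policyV2))
}
```

In the dry run, changing the policy file between approval and execution should cause the held call to be rejected or sent back for review, and the audit output should show which snapshot the approval referenced.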