The production problem
Teams usually ask, “Should we use MCP or A2A?” The operational question is different: “What blocks a valid but dangerous action before dispatch?”
Here is the common failure path. Agent A delegates work to Agent B over A2A. Agent B invokes an MCP tool call that is syntactically correct and authenticated. Without a policy decision point, the action executes by default.
What breaks in real systems
Interoperability protocols answer “how to communicate.” Governance answers “what is allowed right now under policy and approvals.” You need both answers.
What top-ranking sources cover vs miss
I reviewed the top-ranking MCP vs A2A pages and compared them to current protocol specs and Cordum control-plane behavior.
| Source | Strong coverage | Missing piece |
|---|---|---|
| StackOne: MCP vs A2A | Strong protocol boundary framing, architecture patterns, and security failure mode discussion. | No enforceable policy-decision contract or concrete approval queue flow tied to dispatch. |
| TrueFoundry: MCP vs A2A | Clear A2A task model and MCP host-client-server mechanics for enterprise readers. | Governance is described at a program level, not as a protocol primitive with deterministic outcomes. |
| OneReach: MCP vs A2A | Good strategic guidance for choosing single-agent, multi-agent, or hybrid designs. | No detailed pre-dispatch approval and audit model, which is where production controls actually live. |
Protocol boundaries
The A2A v0.3 spec states directly that A2A and MCP are complementary protocols. CAP adds the missing enforcement layer for action decisions.
| Protocol | Primary question answered | Guaranteed by protocol | Not guaranteed |
|---|---|---|---|
| MCP | How does an agent call tools and data sources? | Structured tool/resource interfaces and standard transport patterns. | Whether a risky action should execute now. |
| A2A | How do agents discover and delegate to each other? | Agent Cards, task lifecycle, and cross-agent messaging contracts. | Fine-grained runtime policy gates for downstream actions. |
| CAP | Should this action run under current policy? | Policy decisions, approval requirements, constraints, and decision evidence. | Agent reasoning quality or task decomposition quality. |
MCP vs A2A vs CAP
| Dimension | MCP | A2A | CAP |
|---|---|---|---|
| Core layer | Tool integration | Agent collaboration | Governance and enforcement |
| Canonical objects | Tools, resources, prompts | AgentCard, task, artifact | BusPacket, JobRequest, PolicyCheck |
| Decision outputs | Tool result payload | Task state and artifacts | ALLOW, DENY, REQUIRE_APPROVAL, THROTTLE, ALLOW_WITH_CONSTRAINTS |
| Native human approval gate | No | No | Yes |
| Best at | Making external capabilities callable | Distributing work across specialist agents | Preventing unsafe actions before dispatch |
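For a governance layer to decide on a tool call, the call first has to be expressed in a form that policy can match. The sketch below wraps an MCP tool invocation in a job envelope whose fields mirror the JSON job submission shown later in this article. The `JobEnvelope` and `ToolCall` types, the `NewMCPJob` helper, and the `job.mcp-bridge.<action>.<tool>` topic layout inferred from that example are illustrative assumptions, not a documented Cordum API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// JobEnvelope mirrors the JSON fields of the article's job submission example.
// The Go struct is a hypothetical illustration, not a Cordum type.
type JobEnvelope struct {
	Topic    string            `json:"topic"`
	TenantID string            `json:"tenant_id"`
	Labels   map[string]string `json:"labels"`
	RiskTags []string          `json:"risk_tags,omitempty"`
}

// ToolCall describes an MCP tool invocation an agent wants to make.
// Action is supplied by the caller; classifying tools as read, write,
// or delete is out of scope for this sketch.
type ToolCall struct {
	Server string
	Tool   string
	Action string
}

// NewMCPJob wraps a tool call in an envelope so policy can match on
// topic, labels, and risk tags before anything is dispatched.
func NewMCPJob(tenant string, call ToolCall, riskTags ...string) JobEnvelope {
	return JobEnvelope{
		Topic:    fmt.Sprintf("job.mcp-bridge.%s.%s", call.Action, call.Tool),
		TenantID: tenant,
		Labels: map[string]string{
			"mcp.server": call.Server,
			"mcp.tool":   call.Tool,
			"mcp.action": call.Action,
		},
		RiskTags: riskTags,
	}
}

func main() {
	call := ToolCall{Server: "jira", Tool: "update_ticket", Action: "write"}
	job := NewMCPJob("default", call, "prod")

	out, err := json.MarshalIndent(job, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Keeping the action in both the topic and a label lets a rule match on either one, at the cost of having to keep the two in sync.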
Implementation blueprint
Treat governance as a protocol stage, not a sidecar suggestion. The sequence below maps to Cordum control-plane behavior documented in the system overview and safety-kernel references.
| Stage | Owner | Action | Failure mode if missing |
|---|---|---|---|
| Submit | Gateway | Create job envelope and evaluate submit-time policy. | No gateway check means invalid jobs hit the bus directly. |
| Decide | Safety Kernel | Return policy decision and optional constraints. | Without deterministic decisions, policy intent becomes advisory text. |
| Approve | Human reviewer | Review REQUIRE_APPROVAL jobs against policy snapshot and context. | Without binding to snapshot + job hash, approvals can be replayed incorrectly. |
| Dispatch | Scheduler | Publish only ALLOW and ALLOW_WITH_CONSTRAINTS jobs to worker subjects. | Unchecked dispatch turns policy into a logging tool, not a guardrail. |
| Audit | Control plane | Persist decision reason, actor, timing, and terminal status. | No traceable evidence makes post-incident analysis speculative. |
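The Approve row says approvals should be bound to a policy snapshot and a job hash. One way to picture that binding is sketched below, assuming a SHA-256 digest over the job's JSON encoding and an opaque snapshot identifier. The `Job`, `Approval`, `HashJob`, `Approve`, and `Valid` names are hypothetical and are not taken from Cordum.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Job is a hypothetical, minimal job payload.
type Job struct {
	Topic  string            `json:"topic"`
	Labels map[string]string `json:"labels"`
}

// Approval records exactly what the reviewer looked at.
type Approval struct {
	JobHash        string
	PolicySnapshot string
	Reviewer       string
}

// HashJob returns a hex SHA-256 digest of the job's JSON encoding.
// encoding/json writes map keys in sorted order, so equal jobs hash equally.
func HashJob(j Job) (string, error) {
	b, err := json.Marshal(j)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(b)
	return hex.EncodeToString(sum[:]), nil
}

// Approve binds a reviewer's decision to the job content and policy snapshot.
func Approve(j Job, snapshot, reviewer string) (Approval, error) {
	h, err := HashJob(j)
	if err != nil {
		return Approval{}, err
	}
	return Approval{JobHash: h, PolicySnapshot: snapshot, Reviewer: reviewer}, nil
}

// Valid reports whether the approval still covers this job under the
// snapshot that is current at dispatch time.
func (a Approval) Valid(j Job, currentSnapshot string) bool {
	h, err := HashJob(j)
	if err != nil {
		return false
	}
	return a.JobHash == h && a.PolicySnapshot == currentSnapshot
}

func main() {
	job := Job{
		Topic:  "job.mcp-bridge.write.update_ticket",
		Labels: map[string]string{"mcp.action": "write"},
	}
	approval, err := Approve(job, "policy-v1", "alice")
	if err != nil {
		panic(err)
	}

	fmt.Println(approval.Valid(job, "policy-v1")) // true: same job, same snapshot

	tampered := job
	tampered.Topic = "job.mcp-bridge.write.delete_project"
	fmt.Println(approval.Valid(tampered, "policy-v1")) // false: payload changed
	fmt.Println(approval.Valid(job, "policy-v2"))      // false: policy changed
}
```

Re-validating at dispatch time means an approval cannot be replayed against a modified payload or carried across a policy change; the tradeoff is that a policy reload can invalidate approvals that are still waiting in the queue.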

version: v1
rules:
  - id: allow-read-mcp
    match:
      topics: ["job.mcp-bridge.read.*"]
      labels:
        mcp.action: "read"
    decision: allow
    reason: "Read operations are allowed by default"
  - id: approve-write-mcp
    match:
      topics: ["job.mcp-bridge.write.*"]
      labels:
        mcp.action: "write"
      risk_tags: ["prod"]
    decision: require_approval
    reason: "Production writes require human review"
  - id: deny-destructive-mcp
    match:
      labels:
        mcp.action: "delete"
    decision: deny
    reason: "Destructive actions are blocked"
  - id: constrain-a2a-delegation
    match:
      topics: ["job.a2a.delegate"]
    decision: allow_with_constraints
    constraints:
      max_runtime_sec: 45
      max_retries: 1
    reason: "Delegation allowed with bounded runtime"

type Decision string
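
const (
	Allow                Decision = "ALLOW"
	Deny                 Decision = "DENY"
	RequireApproval      Decision = "REQUIRE_APPROVAL"
	Throttle             Decision = "THROTTLE"
	AllowWithConstraints Decision = "ALLOW_WITH_CONSTRAINTS"
)

// HandleJob asks the safety kernel for a decision and dispatches only
// when policy explicitly permits it.
func HandleJob(req *JobRequest) (*JobStatus, error) {
	decision, err := safetyClient.Check(req)
	if err != nil {
		// A failed safety check never results in dispatch.
		return nil, err
	}

	switch decision.Type {
	case Deny:
		return &JobStatus{State: "DENIED", Reason: decision.Reason}, nil
	case RequireApproval:
		// Park the job until a human reviewer resolves the approval.
		approvalID := approvals.Enqueue(req, decision)
		return &JobStatus{State: "APPROVAL_REQUIRED", ApprovalID: approvalID}, nil
	case AllowWithConstraints:
		constrained := applyConstraints(req, decision.Constraints)
		return scheduler.Dispatch(constrained)
	case Allow:
		return scheduler.Dispatch(req)
	default:
		// THROTTLE handling is omitted from this sketch. It and any
		// unrecognized decision fail closed instead of dispatching.
		return &JobStatus{State: "DENIED", Reason: "unhandled policy decision: " + string(decision.Type)}, nil
	}
}

Operational baseline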
Concrete defaults matter more than abstract governance principles. These are examples worth validating in your own deployment:
- Safety checks run on submit and again before dispatch.
- Safety client timeout defaults to 2 seconds on the hot path.
- Policy reload interval defaults to 30 seconds.
- In closed fail mode, safety-unavailable requests are blocked rather than allowed.
This is where protocol design meets production reality. A “secure by default” statement in a spec is useful. A blocking decision on a real request is better.

# 1) Submit an MCP-derived job
curl -sS -X POST http://localhost:8081/api/v1/jobs \
  -H "Content-Type: application/json" \
  -d '{
    "topic": "job.mcp-bridge.write.update_ticket",
    "tenant_id": "default",
    "labels": {
      "mcp.server": "jira",
      "mcp.tool": "update_ticket",
      "mcp.action": "write"
    },
    "risk_tags": ["prod"]
  }'

# 2) List pending approvals
curl -sS "http://localhost:8081/api/v1/approvals?include_resolved=false"

# 3) Approve the job once policy/context are reviewed
curl -sS -X POST "http://localhost:8081/api/v1/approvals/<job_id>/approve" \
  -H "Content-Type: application/json" \
  -d '{"note":"approved after change window validation"}'

Limitations and tradeoffs
Adding a policy decision point and approval queue increases system complexity and ownership overhead.
Human review protects high-risk paths but can delay user-visible completion for write-heavy workflows.
Rules need iteration. Overly strict policies can block business flow; loose policies defeat the point.