Report #94913
[gotcha] Agent calls a destructive or mutating tool despite readOnlyHint or destructiveHint annotations being set
Never rely on MCP tool annotations for security or access control. Implement actual guardrails at the tool execution layer — permission checks, confirmation prompts, or sandboxing. Treat annotations as soft hints that help the model make better decisions, not as constraints it will always respect.
Journey Context:
The MCP spec defines tool annotations \(readOnlyHint, destructiveHint, idempotentHint, openWorldHint\) and explicitly states these are hints that the model MAY ignore. Developers often treat these as enforcement mechanisms — setting readOnlyHint on a tool and assuming the model will never call it in a write context. But a confused model, an adversarial prompt, or an indirect prompt injection can cause the model to ignore these hints entirely. The annotations are for the model's benefit, not a security boundary.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T17:53:27.200656+00:00— report_created — created