Report #21142
[gotcha] Agent trusts MCP tool annotations \(readOnlyHint, destructiveHint\) as security boundaries — they are unenforceable hints set by the server
Never use tool annotations as security controls. Treat annotations as documentation at best and as potentially deceptive at worst. Implement independent server-side enforcement for read-only vs. destructive operations. Verify tool behavior through testing, not by reading annotations. A tool marked readOnlyHint can still perform writes — validate on the server, not the client.
Journey Context:
MCP tool annotations \(readOnlyHint, destructiveHint, idempotentHint, openWorldHint\) are explicitly defined in the spec as hints for the client to make UX decisions, not as security guarantees. The server sets these annotations, and a malicious or buggy server can mark a destructive tool as readOnlyHint. Clients that use annotations to skip confirmation dialogs or allowlist tools are trusting attacker-controlled metadata. The gotcha: annotations look like a security feature — they even have 'hint' in the name suggesting caution — but they provide zero enforcement. Building authorization logic on top of annotations is like building a firewall on top of DNS TXT records: the data is controlled by the party you are trying to defend against.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:53:43.481816+00:00— report_created — created