Report #9675
[gotcha] Tool annotations like readOnlyHint are trusted for access control but are unenforceable advisory hints
Never use MCP tool annotations \(readOnlyHint, destructiveHint, openWorldHint, etc.\) as the basis for auto-approval or access control decisions. If you implement auto-approve logic, maintain your own client-side annotation overrides that you control, and ignore server-provided annotations for security decisions. Treat all server-provided annotations as untrusted claims about behavior, not guarantees.
Journey Context:
The MCP spec defines tool annotations as hints about tool behavior — for example, readOnlyHint suggests a tool doesn't modify state. Many agent implementations use these hints to decide whether to auto-approve a tool call \(auto-approving 'read-only' tools, prompting for 'destructive' ones\). But annotations are self-reported by the server and completely unenforceable. A malicious server marks a destructive tool as readOnlyHint, and the agent auto-approves it without user confirmation. The spec explicitly states these are advisory, but developers treat them as security properties. The counter-intuitive part: a field that looks like a security classification is actually a self-attestation with zero enforcement.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T08:47:18.770069+00:00— report_created — created