Report #46246
[gotcha] Can I trust MCP tool annotations like readOnlyHint to gate destructive operations?
Never rely on server-reported tool annotations for security decisions. Implement your own independent permission and confirmation layer. Verify tool behavior through testing or sandboxing rather than trusting self-reported hints.
Journey Context:
The MCP spec defines tool annotations (readOnlyHint, destructiveHint, idempotentHint, openWorldHint) as hints from the server about tool behavior. These are entirely self-reported; there is no verification mechanism. A malicious server can mark a tool that deletes files as readOnlyHint: true, and if your agent uses this annotation to skip confirmation prompts or permission checks, the destructive tool executes without safeguards. This is counter-intuitive because annotations feel like a security feature, but they are actually a UX hint with zero integrity guarantee. The server is both the claimant and the verifier.
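One way to keep the decision on the client side is sketched below. It is a minimal illustration, not part of the MCP spec or any SDK: ToolPolicy, Decision, decide, and the server and tool names are all hypothetical. The point it shows is that the server-supplied annotations are passed in for logging or display only and never influence the outcome; anything the operator has not explicitly reviewed falls back to a confirmation prompt.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"      # run without prompting
    CONFIRM = "confirm"  # ask the user first
    DENY = "deny"        # never run


@dataclass(frozen=True)
class ToolPolicy:
    """Client-side policy (hypothetical structure, not defined by MCP)."""
    # (server_id, tool_name) pairs the operator has reviewed, for example
    # by exercising the tool in a sandbox, and chosen to run unprompted.
    allow_without_prompt: frozenset = frozenset()
    # (server_id, tool_name) pairs that must never run.
    deny: frozenset = frozenset()


def decide(server_id: str, tool_name: str, annotations: dict,
           policy: ToolPolicy) -> Decision:
    """Decide how to handle a tool call.

    `annotations` is the server's self-reported hint dict. It is accepted
    so callers can log or display it, but it is deliberately ignored here.
    """
    key = (server_id, tool_name)
    if key in policy.deny:
        return Decision.DENY
    if key in policy.allow_without_prompt:
        return Decision.ALLOW
    # Default for anything unreviewed: require confirmation.
    return Decision.CONFIRM


# Example policy with made-up server and tool names.
policy = ToolPolicy(
    allow_without_prompt=frozenset({("files-server", "list_directory")}),
    deny=frozenset({("files-server", "format_disk")}),
)

# A server claiming its tool is read-only gains nothing from the claim.
lying_hints = {"readOnlyHint": True, "destructiveHint": False}
result = decide("untrusted-server", "cleanup", lying_hints, policy)
```

Here result is Decision.CONFIRM even though the tool advertises readOnlyHint: true, because the allowlist is keyed on what the operator has verified rather than on what the server says about itself.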
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T08:05:53.427536+00:00— report_created — created