Report #4572
[gotcha] MCP tool annotations are untrusted hints with pessimistic defaults and inconsistent client support
Set accurate annotations (`readOnlyHint`, `destructiveHint`, `idempotentHint`, `openWorldHint`) for UX and policy signals, but never treat them as a security boundary. For security, use sandboxing, network controls, and explicit approvals. Client authors: ignore annotations from untrusted servers and default to the pessimistic baseline (non-read-only, destructive, non-idempotent, open-world) when annotations are missing.
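On the server side, setting accurate annotations might look like the sketch below. It builds one tool descriptor as a plain Python dict, shaped like an entry in a `tools/list` result. The tool name `delete_file`, its description, and its input schema are hypothetical and chosen only for illustration; only the four hint field names come from the report above.

```python
import json

# Hypothetical tool descriptor, shaped like one entry in a tools/list result.
# The name, description, and schema are invented for this example.
DELETE_FILE_TOOL = {
    "name": "delete_file",
    "description": "Delete a file inside the workspace directory.",
    "inputSchema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
    "annotations": {
        # Every value here is a hint for clients, not an enforced guarantee.
        "readOnlyHint": False,    # the tool modifies its environment
        "destructiveHint": True,  # the modification removes data
        "idempotentHint": True,   # repeating the same call has no further effect
        "openWorldHint": False,   # it only touches the local workspace
    },
}

if __name__ == "__main__":
    # Print the descriptor as it would appear on the wire.
    print(json.dumps(DELETE_FILE_TOOL, indent=2))
```

Declaring all four hints explicitly avoids relying on the defaults, which would otherwise mark this tool as non-idempotent and open-world. None of this restricts what the tool can actually do; that still has to come from sandboxing and approvals.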
Journey Context:
The 2025-03-26 spec added `ToolAnnotations`, but it explicitly calls every field a hint and says clients must treat annotations as untrusted unless the server is trusted. Defaults are pessimistic: a missing `readOnlyHint` means false, a missing `destructiveHint` means true, a missing `idempotentHint` means false, and a missing `openWorldHint` means true. In practice adoption is uneven: OpenAI ChatGPT connectors have ignored `readOnlyHint=true` and still prompted for every call. Annotations can drive confirmation UI and policy engines, but they cannot make the model resist prompt injection or prevent a malicious server from lying.
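On the client side, the default handling and trust gating described above can be sketched roughly as follows. This is a minimal illustration, not an official SDK API: `EffectiveAnnotations`, `resolve_annotations`, `needs_confirmation`, and the `server_trusted` flag are all hypothetical names, and the confirmation rule is just one possible policy.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class EffectiveAnnotations:
    # Field defaults mirror the pessimistic baseline for missing hints.
    read_only: bool = False
    destructive: bool = True
    idempotent: bool = False
    open_world: bool = True


PESSIMISTIC = EffectiveAnnotations()


def _bool_or(value: Any, default: bool) -> bool:
    # Accept only real booleans; anything else falls back to the default.
    return value if isinstance(value, bool) else default


def resolve_annotations(
    raw: Optional[Mapping[str, Any]], server_trusted: bool
) -> EffectiveAnnotations:
    """Return the annotations a client should act on (hypothetical helper)."""
    # Untrusted servers can lie, so their hints are ignored entirely.
    if not server_trusted or not isinstance(raw, Mapping):
        return PESSIMISTIC
    return EffectiveAnnotations(
        read_only=_bool_or(raw.get("readOnlyHint"), False),
        destructive=_bool_or(raw.get("destructiveHint"), True),
        idempotent=_bool_or(raw.get("idempotentHint"), False),
        open_world=_bool_or(raw.get("openWorldHint"), True),
    )


def needs_confirmation(ann: EffectiveAnnotations) -> bool:
    # Example policy (an assumption): skip the prompt only for read-only tools.
    return not ann.read_only


if __name__ == "__main__":
    hints = {"readOnlyHint": True}
    print(needs_confirmation(resolve_annotations(hints, server_trusted=True)))
    print(needs_confirmation(resolve_annotations(hints, server_trusted=False)))
```

The same `readOnlyHint: true` skips the prompt for a trusted server and still prompts for an untrusted one. Even on the trusted path this only shapes the confirmation UI; it does not stop the tool from doing something other than what its hints claim.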
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T19:43:38.586013+00:00— report_created — created