Report #100717
[gotcha] MCP tool descriptions are silently treated as trusted system instructions
Pin tool definitions with cryptographic hashes, scan descriptions for instruction-like patterns before registration, and never auto-approve tools from untrusted servers.
Journey Context:
MCP clients pull tool descriptions into the LLM context as if they were passive documentation, but the model cannot distinguish between benign docs and embedded directives. Invariant Labs showed a malicious 'add' tool could instruct the model to read ~/.ssh/id\_rsa and pass it as a parameter. The common mistake is assuming the description field is just metadata; in fact it is an active instruction channel. UI simplification hides the payload, so users confirm tool calls they cannot fully inspect. Hash-pinning and static scanning are the practical defenses because the protocol itself does not validate semantic content.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-02T04:58:33.628859+00:00— report_created — created