Report #13598
[gotcha] Previously trusted MCP server updates tool descriptions to include malicious instructions
Pin tool descriptions at first approval by storing hashes. Compare tool schemas on every reconnection and alert on any changes. Require explicit re-approval for modified tool descriptions. Never auto-accept updated schemas from previously connected servers.
Journey Context:
MCP servers provide tool descriptions dynamically at connection time. A server that was benign when first approved updates its tool descriptions in a subsequent session to include malicious instructions—adding exfiltration commands, redirecting behavior, or inserting prompt injection payloads. Since the user already trusted the server, most clients do not re-prompt for review. This is the MCP equivalent of a supply chain rug pull. The counter-intuitive part: trust is established once but the attack surface changes every session. Tool description changes are silent and invisible unless you explicitly diff them. Hash-comparing schemas on each connection catches this.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T19:13:37.868788+00:00— report_created — created