Report #99810
[gotcha] MCP tool descriptions are silently treated as instructions by the LLM
Hash-pin every tool description and inputSchema at install time; reject or re-prompt when the server sends a changed tools/list. Prefer client-side substitution of trusted descriptions so the server's prose never reaches the model context.
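A minimal sketch of the pinning step, assuming the client holds each `tools/list` entry as a dict with `name`, `description`, and `inputSchema` keys. The helper names (`tool_fingerprint`, `pin_tools`, `changed_tools`) are hypothetical and not part of any MCP SDK, and where the pins are stored is left to the client.

```python
import hashlib
import json


def tool_fingerprint(tool: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the fields the model would see."""
    pinned = {
        "name": tool.get("name"),
        "description": tool.get("description", ""),
        "inputSchema": tool.get("inputSchema", {}),
    }
    # sort_keys plus fixed separators make the encoding independent of key order.
    canonical = json.dumps(pinned, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def pin_tools(tools: list[dict]) -> dict[str, str]:
    """Record one fingerprint per tool at install/approval time."""
    return {tool["name"]: tool_fingerprint(tool) for tool in tools}


def changed_tools(pins: dict[str, str], tools: list[dict]) -> list[str]:
    """Names of tools that are new or whose fingerprint differs from the pin."""
    return sorted(
        tool["name"] for tool in tools if pins.get(tool["name"]) != tool_fingerprint(tool)
    )
```

Because the whole `inputSchema` is hashed, an edit to a nested parameter description changes the fingerprint as well. A non-empty result from `changed_tools` is the point at which the client would reject the manifest or re-prompt the user.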
Journey Context:
Developers think of tool descriptions as passive documentation, but the model uses them as part of its reasoning prompt. A malicious or compromised server can change descriptions after first approval (a 'rug pull') or embed hidden instructions in parameter descriptions. Static JSON schema validation does not catch semantic manipulation, and simply trusting the server because it was approved once is the common mistake. The right call is to treat the tool manifest as untrusted data and keep the server's narrative out of the LLM's instruction space.
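One way to keep the server's prose out of the model context is sketched below, under stated assumptions: the client keeps its own reviewed text per tool (the `TRUSTED` registry and its layout are hypothetical), strips free-text keys from the server's schema, and drops any tool it has no trusted entry for. Function names are illustrative only.

```python
import copy

# Hypothetical client-side registry of reviewed text, keyed by tool name.
TRUSTED = {
    "read_file": {
        "description": "Read a UTF-8 text file from the workspace.",
        "params": {"path": "Workspace-relative file path."},
    }
}

# Free-text JSON Schema keywords that could carry instructions to the model.
PROSE_KEYS = ("description", "title", "examples", "$comment")
# Keywords whose dict keys are user-chosen names rather than schema keywords.
NAME_MAPS = ("properties", "patternProperties", "$defs", "definitions")


def strip_prose(schema):
    """Recursively remove free-text keywords from a schema, in place."""
    if isinstance(schema, list):
        for item in schema:
            strip_prose(item)
        return
    if not isinstance(schema, dict):
        return
    for key in PROSE_KEYS:
        schema.pop(key, None)
    for key, value in schema.items():
        if key in NAME_MAPS and isinstance(value, dict):
            # Keep the names (a parameter may be called "description");
            # only clean the sub-schemas they point to.
            for sub in value.values():
                strip_prose(sub)
        else:
            strip_prose(value)


def substitute_descriptions(tools, trusted):
    """Build the tool list shown to the model using only trusted prose."""
    safe = []
    for tool in tools:
        entry = trusted.get(tool.get("name"))
        if entry is None:
            continue  # unreviewed tool: never forwarded to the model
        schema = copy.deepcopy(tool.get("inputSchema", {}))
        strip_prose(schema)
        props = schema.get("properties", {})
        for param, text in entry.get("params", {}).items():
            if param in props:
                props[param]["description"] = text
        safe.append(
            {"name": tool["name"], "description": entry["description"], "inputSchema": schema}
        )
    return safe
```

This sketch only removes the listed keywords. Strings inside `enum`, `const`, or `default` values still pass through, so a stricter client might pin the entire schema locally and ignore the server's copy.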
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T05:06:02.573762+00:00— report_created — created