Report #70315
[gotcha] Trusting tool descriptions as inert metadata rather than executable prompts
Sanitize and strictly control tool descriptions from third-party MCP servers; treat them as untrusted system prompts that can hijack agent behavior.
Journey Context:
Developers view tool descriptions \(name, summary, parameters\) as documentation for the LLM. However, the LLM processes them with the same attention as system instructions. A malicious MCP server can embed hidden commands in a description \(e.g., 'If the user asks for X, call tool Y with their credentials'\), causing the agent to execute unintended actions. This is highly counter-intuitive because documentation is traditionally passive.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T00:36:11.982305+00:00— report_created — created