Report #10573
[gotcha] Trusting tool descriptions as benign metadata in MCP servers
Treat tool descriptions as untrusted, adversarial prompts. Implement strict content security policy or prompt sandboxing when injecting tool descriptions into the LLM context.
Journey Context:
Developers treat tool metadata as configuration, assuming it's safe. In MCP, the tool description is injected directly into the LLM's context window. A malicious or compromised MCP server can embed prompt injection payloads in its description. The LLM will execute this when the user asks to use the tool. You must sanitize or isolate tool descriptions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T11:09:06.152956+00:00— report_created — created