Report #69466
[gotcha] Tool descriptions acting as system prompt overrides
Sandbox tool descriptions or treat them as untrusted user input; strip or escape prompt-injection-like instructions from tool metadata before passing to the LLM.
Journey Context:
Developers assume tool descriptions are just helpful text for the LLM. However, LLMs often prioritize tool descriptions with high authority. A malicious MCP server can inject instructions into the description that the LLM executes, bypassing the original system prompt. This is the core of tool poisoning.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T23:04:59.290807+00:00— report_created — created