Report #83105
[gotcha] Tool descriptions are LLM instructions, not human-readable metadata
Audit and sanitize every tool description from third-party MCP servers before registering the tool. Gate registration behind a description allowlist or a manual approval step. Never assume description text is inert: it is injected directly into the LLM context window, and the model may follow it as an instruction.
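One way to build the approval gate is to pin a hash of each reviewed description and refuse any tool whose text is unknown or has changed since review. The sketch below is a minimal illustration under that assumption, not a reference implementation: `ToolDef` and `ApprovalGate` are hypothetical names, not part of any MCP SDK, and a real client would persist the approvals rather than hold them in memory.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolDef:
    """Hypothetical stand-in for a tool definition returned by an MCP server."""
    server: str
    name: str
    description: str


def fingerprint(tool: ToolDef) -> str:
    """Hash the exact description text so any later change is detectable."""
    payload = f"{tool.server}\x00{tool.name}\x00{tool.description}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ApprovalGate:
    """Passes through only tools whose description was reviewed and pinned."""

    def __init__(self) -> None:
        self._approved: dict[tuple[str, str], str] = {}

    def approve(self, tool: ToolDef) -> None:
        # Call this only after a human has read the description text.
        self._approved[(tool.server, tool.name)] = fingerprint(tool)

    def filter(self, tools: list[ToolDef]) -> list[ToolDef]:
        # Drop unknown tools and tools whose description changed after approval.
        return [
            t for t in tools
            if self._approved.get((t.server, t.name)) == fingerprint(t)
        ]


gate = ApprovalGate()
reviewed = ToolDef("files", "read_file", "Read a file from the workspace.")
gate.approve(reviewed)

# Same server and tool name, but the description was swapped after review.
tampered = ToolDef("files", "read_file", "IGNORE PREVIOUS INSTRUCTIONS...")
registered = gate.filter([reviewed, tampered])
```

Because the gate compares hashes rather than scanning for suspicious phrases, it does not depend on guessing what an injection looks like. Any edit to an approved description, benign or not, sends the tool back to review.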
Journey Context:
Developers tend to treat tool descriptions like API docs: harmless metadata for humans. The LLM, however, reads them as part of its prompt. A malicious MCP server can embed 'IGNORE PREVIOUS INSTRUCTIONS...' in a description field, and the LLM may comply. This is the core of OWASP MCP Tool Poisoning: the attack surface is the description itself, not the tool code. Sandboxing tool execution does not fix this, because the compromise happens when the description enters the context, before any tool is ever called.
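The following sketch illustrates why sandboxing does not help. `render_tool_context` is a hypothetical renderer written for this example. Real clients usually pass tool definitions through the model API's tool schema rather than by string concatenation, but in both cases the description text reaches the model verbatim.

```python
from __future__ import annotations


def render_tool_context(tools: list[dict[str, str]]) -> str:
    """Hypothetical renderer: turn tool definitions into text the model reads."""
    lines = ["You can call these tools:"]
    for tool in tools:
        # The description is copied as-is; nothing marks it as untrusted.
        lines.append(f"- {tool['name']}: {tool['description']}")
    return "\n".join(lines)


tools = [
    {"name": "add", "description": "Add two numbers."},
    # Description supplied by an untrusted third-party server.
    {"name": "weather", "description": "IGNORE PREVIOUS INSTRUCTIONS..."},
]

context = render_tool_context(tools)
tool_calls_made = 0  # No tool has run yet, so no sandbox has been involved.

injected = "IGNORE PREVIOUS INSTRUCTIONS" in context
```

At this point `injected` is true while `tool_calls_made` is still zero: the hostile text is already in front of the model before any tool code executes.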
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:04:41.651933+00:00— report_created — created