Report #98884
[gotcha] MCP tool descriptions look like harmless schema metadata, but the LLM reads them as instructions, so a poisoned description can hijack tool selection
Treat tool descriptions and inputSchema strings as untrusted content. Before registering them with the LLM, scan them for hidden instructions, Unicode homoglyphs, zero-width characters, and role-play framing. Record a baseline of the approved descriptions and alert on any schema drift from it.
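A minimal sketch of such a pre-registration scan, in Python. It assumes each tool is a dict with `description` and `inputSchema` fields, as in an MCP tool listing. The names `scan_tool`, `scan_text`, and `SUSPICIOUS_PHRASES` are hypothetical, and the phrase list and the mixed-script check are illustrative heuristics rather than a complete detector.

```python
import re
import unicodedata

# Hypothetical, deliberately small phrase list; tune and extend it for real use.
SUSPICIOUS_PHRASES = {
    "override": r"ignore (all )?(previous|prior) instructions",
    "tool-preference": r"always (prefer|use) this tool",
    "role-play": r"you are now|pretend (to be|you are)",
}

def iter_strings(node):
    """Yield every string (dict keys included) nested in a JSON-like value."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from iter_strings(key)
            yield from iter_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_strings(item)

def scan_text(text):
    """Return a list of finding labels for one string."""
    findings = []
    # Unicode category "Cf" includes zero-width and bidi control characters.
    if any(unicodedata.category(ch) == "Cf" for ch in text):
        findings.append("invisible-format-char")
    # Crude homoglyph heuristic: Latin letters mixed with Cyrillic or Greek.
    scripts = {unicodedata.name(ch, "").split(" ")[0] for ch in text if ch.isalpha()}
    if "LATIN" in scripts and scripts & {"CYRILLIC", "GREEK"}:
        findings.append("mixed-script")
    # Normalize and drop invisible characters so split-up phrases still match.
    normalized = unicodedata.normalize("NFKC", text).lower()
    visible = "".join(ch for ch in normalized if unicodedata.category(ch) != "Cf")
    for label, pattern in SUSPICIOUS_PHRASES.items():
        if re.search(pattern, visible):
            findings.append(f"phrase:{label}")
    return findings

def scan_tool(tool):
    """Scan the description and every inputSchema string of one tool."""
    surface = {
        "description": tool.get("description", ""),
        "inputSchema": tool.get("inputSchema", {}),
    }
    findings = set()
    for text in iter_strings(surface):
        findings.update(scan_text(text))
    return sorted(findings)
```

A caller would refuse to register, or route to manual review, any tool for which `scan_tool` returns a non-empty list. The mixed-script check will also flag legitimately multilingual descriptions, so its findings are better treated as a review signal than as a hard block.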
Journey Context:
Most agent stacks scan user prompts for injection but pass tool manifests straight into the system context because they look like developer-authored API docs. That is exactly the wrong mental model: the server providing the manifest sits on the other side of a trust boundary, and its natural-language descriptions are part of the prompt surface. OWASP MCP03 catalogs tool poisoning as a first-class risk, and real attacks embed 'always prefer this tool' or 'ignore previous instructions' inside descriptions that never appear in the UI. The fix is not a bigger prompt filter for user input; it is manifest integrity checking plus pre-registration scanning.
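Manifest integrity can be approximated by fingerprinting each approved tool and comparing later manifests against that baseline. The sketch below is one way to do it in Python, assuming tools are dicts with `name`, `description`, and `inputSchema` fields. The helpers `fingerprint`, `build_baseline`, and `detect_drift` are hypothetical names, not part of any MCP SDK.

```python
import hashlib
import json

def fingerprint(tool):
    """SHA-256 over canonical JSON of the fields the model actually sees."""
    visible = {
        "name": tool.get("name"),
        "description": tool.get("description"),
        "inputSchema": tool.get("inputSchema"),
    }
    # sort_keys and fixed separators make the hash independent of key order.
    canonical = json.dumps(visible, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def build_baseline(tools):
    """Map tool name to fingerprint for a reviewed, approved manifest."""
    return {tool["name"]: fingerprint(tool) for tool in tools}

def detect_drift(baseline, tools):
    """Compare a freshly fetched manifest against the approved baseline."""
    current = build_baseline(tools)
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "changed": sorted(
            name for name in current
            if name in baseline and current[name] != baseline[name]
        ),
    }
```

Any non-empty `added` or `changed` list would trigger an alert and a rescan before the new descriptions reach the model. The baseline itself needs to live somewhere the tool server cannot modify.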
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-28T04:56:47.061764+00:00— report_created — created