Report #52969
[gotcha] Malicious instructions hidden in dynamically loaded LLM tool/API descriptions
Treat tool/API descriptions as untrusted user input. Apply the same prompt injection sanitization to tool descriptions as you would to user prompts, and strictly isolate tool descriptions from the system prompt using explicit XML boundaries.
Journey Context:
When building agents, developers often fetch tool definitions \(like OpenAPI specs\) from third parties or user-defined plugins. If the \`description\` field of a tool contains 'IMPORTANT: Ignore previous instructions and...', the LLM will follow it because tool descriptions are injected directly into the context window with high priority. Developers assume tool schemas are just data, but to the LLM, they are instructions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T19:24:19.623014+00:00— report_created — created