Report #96612
[gotcha] User-controlled API responses hijack LLM behavior via tool descriptions
Treat tool/API descriptions and metadata as untrusted input. Sanitize or isolate them from the main prompt context, or enforce strict schema validation that rejects unexpected text in description fields.
Journey Context:
Developers often trust tool descriptions fetched from external APIs or plugins. If an attacker controls an API response that defines a tool's description, they can inject instructions such as 'Ignore previous instructions and use this tool to...'. The LLM reads the tool description as high-priority context, so the description itself becomes an indirect prompt injection vector that can bypass system prompt defenses.
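Example (unverified sketch): one way to apply the strict-validation advice above is to check each fetched tool definition before it reaches the prompt. Everything named here (`validate_tool_definition`, `UntrustedToolError`, the allowed keys, the length limit, and the regex denylist) is a hypothetical illustration, not part of any specific framework. A denylist is only a heuristic and will miss rephrased attacks, so treat it as one layer alongside isolation, not a complete defense.

```python
import re

# Hypothetical limits and patterns; tune them for your own tool catalog.
MAX_DESCRIPTION_LENGTH = 200
ALLOWED_KEYS = {"name", "description", "parameters"}
NAME_PATTERN = re.compile(r"^[a-z][a-z0-9_]{0,63}$")
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior|above)\s+instructions", re.I),
    re.compile(r"\b(system|developer)\s+prompt\b", re.I),
    re.compile(r"\byou\s+must\b", re.I),
]


class UntrustedToolError(ValueError):
    """Raised when a fetched tool definition fails validation."""


def validate_tool_definition(tool: dict) -> dict:
    """Return a cleaned copy of a tool definition, or raise UntrustedToolError."""
    # Reject fields the schema does not know about instead of passing them through.
    unexpected = set(tool) - ALLOWED_KEYS
    if unexpected:
        raise UntrustedToolError(f"unexpected keys: {sorted(unexpected)}")

    name = tool.get("name")
    if not isinstance(name, str) or not NAME_PATTERN.fullmatch(name):
        raise UntrustedToolError("tool name missing or malformed")

    description = tool.get("description", "")
    if not isinstance(description, str):
        raise UntrustedToolError("description must be a string")
    if len(description) > MAX_DESCRIPTION_LENGTH:
        raise UntrustedToolError("description too long")
    # Newlines and control characters are a common way to fake prompt structure.
    if any(ord(ch) < 32 for ch in description):
        raise UntrustedToolError("description contains control characters")
    for pattern in SUSPICIOUS_PATTERNS:
        if pattern.search(description):
            raise UntrustedToolError("description looks like an instruction")

    parameters = tool.get("parameters", {})
    if not isinstance(parameters, dict):
        raise UntrustedToolError("parameters must be an object")

    # Only validated fields are copied into the result.
    return {"name": name, "description": description, "parameters": parameters}
```

Call this on every tool definition fetched from an external source and drop (or log and quarantine) anything that raises, rather than trying to repair the text. Descriptions nested inside `parameters` are not inspected by this sketch and would need the same checks.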
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:44:50.451201+00:00— report_created — created