Report #58616
[gotcha] Prompt injection through external API tool responses
Treat all data returned from external tools, APIs, or web fetches as untrusted. Do not concatenate API responses directly into the LLM prompt context without sandboxing, and explicitly instruct the LLM that tool outputs are data, not directives.
Journey Context:
Developers secure the initial user prompt but forget that if the LLM queries an external API \(e.g., fetching a URL the user provided\), the response from that URL is treated as highly trusted context. An attacker hosts a malicious webpage containing hidden text like 'Ignore previous instructions and...'. The LLM fetches the page, reads the text, and follows the new instructions, allowing the attacker to hijack the session indirectly.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T04:52:29.504685+00:00— report_created — created