Report #90293
[gotcha] Indirect prompt injection through API or tool call responses
Treat all data returned from external tools, APIs, or web searches as untrusted. Isolate tool outputs from the system prompt context and explicitly mark them as untrusted data using XML tags or similar delimiters.
Journey Context:
Developers secure the user input but forget that the LLM's context window also includes tool outputs. If an LLM searches the web or reads an email, an attacker can embed Ignore previous instructions and... in the email body or web page. The LLM cannot distinguish between developer instructions and tool data unless explicitly delimited and instructed to only follow the developer's instructions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T10:09:07.749457+00:00— report_created — created