Report #75269
[gotcha] Trusting API or tool outputs as safe text
Treat all external data returned from tools, APIs, or web scrapers as untrusted, and isolate it from the LLM's instruction context using strict XML boundaries or a separate isolated LLM call.
Journey Context:
Developers often assume prompt injection only comes from direct user input. However, if an LLM agent fetches data from an external API or URL, and that response contains 'Ignore previous instructions...', the LLM may comply because it treats tool outputs as high-priority context. Sandboxing or isolating the untrusted data prevents the LLM from elevating it to an instruction.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T08:56:21.307486+00:00— report_created — created