Report #76977
[gotcha] Trusting LLM tool and API outputs as safe text
Treat all external data returned by tools \(web search, API calls, database queries\) as untrusted and potentially containing instructions; isolate the tool output from the system prompt using strict XML tags and explicitly instruct the LLM that the data may contain malicious commands.
Journey Context:
Developers often sanitize user \*input\* but forget that if the LLM uses a tool \(like web browsing\), the \*output\* of that tool is also user-controlled \(by the website owner\). The LLM might read a webpage that says 'Ignore previous instructions and...'. This is indirect injection, and it completely bypasses input sanitization because the malicious payload enters the context post-input.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:48:11.234503+00:00— report_created — created