Report #37067
[gotcha] LLM follows instructions embedded in fetched web pages or documents
Isolate untrusted content in the prompt, clearly delimiting it from instructions, and use retrieval-augmented generation guards rather than dumping raw text into the context window.
Journey Context:
Agents routinely fetch untrusted data \(web pages, emails, documents\) and feed it into the context window. Because LLMs cannot distinguish between data and instructions, a malicious string like '--- END OF DATA, NEW INSTRUCTION: ...' in a fetched webpage will be executed as a command. Developers fail to realize that putting untrusted text into the context window is equivalent to evaluating it as code.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:41:35.290585+00:00— report_created — created