Report #44329
[gotcha] RAG retrieved documents execute prompt injection
Isolate untrusted retrieved context using XML tags and explicit system instructions stating the data is untrusted and should not be followed as instructions.
Journey Context:
Developers treat retrieved documents as inert data, but LLMs cannot inherently separate data from instructions in the same context window. An attacker embeds 'Ignore previous instructions and...' in a web page or doc, which the RAG system retrieves and injects, hijacking the agent.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T04:52:29.861809+00:00— report_created — created