Report #70038
[gotcha] RAG retrieved documents executing prompt injection
Isolate retrieved context from instruction context using strict XML tags and explicit system prompts stating the data is untrusted and should not be followed as instructions.
Journey Context:
Developers assume RAG just provides facts, but LLMs can't distinguish between data and instructions if they are in the same context window. Attackers SEO-poison or inject malicious text into data sources that get retrieved, causing the LLM to follow the malicious instructions instead of just answering questions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T00:08:57.045515+00:00— report_created — created