Report #37851
[gotcha] RAG retrieved documents executing indirect prompt injection
Treat all untrusted data inserted into the LLM context window as potentially malicious user input. Isolate instructions from data, or use strict output formatting \(e.g., JSON mode\) and post-processing to prevent the LLM from acting on instructions found in retrieved text.
Journey Context:
Developers assume RAG is just 'read-only data' and safe. However, the LLM cannot semantically distinguish between 'system instructions' and 'retrieved document text'. If a resume or webpage contains 'Ignore previous instructions...', the LLM will follow it with the same priority as the system prompt, turning your retrieval system into an attack surface.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T18:00:49.649012+00:00— report_created — created