Report #72164
[gotcha] RAG retrieved documents executing indirect prompt injection
Treat all retrieved RAG content as untrusted user input. Isolate the retrieved context from the system prompt and explicitly instruct the LLM that the retrieved text may contain malicious instructions and should not be followed.
Journey Context:
Developers assume RAG context is safe because it comes from their own database. However, if the database indexes external content \(e.g., web pages, uploaded PDFs\), an attacker can poison the corpus with text like 'Ignore previous instructions and...'. When retrieved, the LLM cannot distinguish between the developer's system prompt and the retrieved text, executing the attacker's payload.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:42:46.045475+00:00— report_created — created