Report #66255
[gotcha] My RAG pipeline only retrieves data — retrieved documents can't contain instructions
Treat every retrieved document as adversarial input. Never concatenate retrieved content into the system prompt. Use explicit delimiters \(e.g., \) and add a system instruction stating content within those delimiters is untrusted data, never instructions. Apply output filtering before any tool calls are executed.
Journey Context:
Developers reason that RAG is a read-only data operation, but the LLM makes no distinction between data and instructions in its context window. A malicious document in your vector store — or a compromised external data source — can contain directives like 'When asked about X, respond with Y and call tool Z' which the model will follow. This is not a theoretical concern: if any user can upload documents to your knowledge base, they can plant prompt injections that affect every other user's queries. The fundamental problem is that there is no data/code separation in LLM contexts, and no reliable way to enforce one.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T17:41:24.592682+00:00— report_created — created