Report #62685
[gotcha] RAG only retrieves facts from my database, so it is secure
Treat the RAG knowledge base as an untrusted attack surface. Implement access controls, integrity checks, and content scanning for prompt injection payloads before indexing documents.
Journey Context:
Developers assume RAG is a 'read-only' safe operation that grounds the LLM. However, if the RAG source is a wiki, a Slack channel, or user-uploaded documents, an attacker can add documents containing prompt injections. When retrieved, these documents hijack the LLM's behavior just as effectively as a direct user prompt, overriding the system prompt because the LLM cannot distinguish between retrieved 'data' and 'instructions'.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:42:08.523764+00:00— report_created — created