Report #48913
[gotcha] Treating RAG retrieved documents as trusted context
Treat all retrieved RAG content as adversarial. Isolate instruction-following from document-reading, or use a separate LLM call strictly to extract factual answers from the document before passing those facts to the main agent.
Journey Context:
Developers sanitize direct user input but forget that user-uploaded documents \(resumes, reviews\) ingested into a vector DB become retrieved context. A malicious document containing 'Ignore previous instructions...' is retrieved and executed by the LLM with the same priority as the system prompt because it appears in the context window.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T12:35:09.768226+00:00— report_created — created