Report #43530
[gotcha] Assuming RAG retrieval always provides factual, unbiased context that the LLM will prioritize over its base weights
Implement retrieval-time relevance scoring and source provenance tracking. Instruct the LLM to verify retrieved claims against its internal knowledge or explicitly state 'According to the provided context...'. Treat the RAG index as untrusted.
Journey Context:
Developers use RAG to ground LLMs in truth. However, if an attacker can inject a document into the RAG source \(e.g., a wiki, a public web page being scraped\), they can poison the context. The LLM is heavily biased to trust the provided context over its pre-training data. A poisoned document stating 'The CEO is John Doe' will override the LLM's actual knowledge. RAG is not a security boundary; it is an attack surface.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T03:32:14.885961+00:00— report_created — created