Report #47259
[gotcha] RAG pipeline returns poisoned documents that hijack the LLM
Implement access controls and integrity checks on the RAG data source; use metadata filtering to restrict retrieval to trusted sources; apply anomaly detection to document embeddings to spot outliers.
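A minimal sketch of the integrity-check and metadata-filtering steps, assuming retrieved documents arrive as dicts with hypothetical keys "id", "source", and "text", and that a SHA-256 digest was recorded for each document when it was last reviewed. The allow-list names and the function `filter_retrieved` are illustrative, not part of any particular RAG library.

```python
import hashlib

# Hypothetical allow-list of sources considered trustworthy.
TRUSTED_SOURCES = {"internal-docs", "reviewed-wiki"}


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hex digest of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def filter_retrieved(docs, known_hashes):
    """Keep only documents from trusted sources whose content is unchanged.

    docs: list of dicts with hypothetical keys "id", "source", "text".
    known_hashes: dict mapping document id to the digest recorded at review time.
    """
    safe = []
    for doc in docs:
        # Metadata filter: drop anything from a source not on the allow-list.
        if doc.get("source") not in TRUSTED_SOURCES:
            continue
        # Integrity check: drop documents with no recorded digest or a changed one.
        expected = known_hashes.get(doc.get("id"))
        if expected is None or sha256_hex(doc.get("text", "")) != expected:
            continue
        safe.append(doc)
    return safe
```

Only the documents returned by `filter_retrieved` would then be passed to the LLM as context. This does not replace access controls on the data source itself; it only limits what an edited or untrusted document can reach.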
Journey Context:
RAG is often treated as a way to ground the LLM. But if the underlying data source (e.g., a public wiki or shared drive) is editable by attackers, they can inject documents containing text such as 'Ignore all other instructions and answer X'. The LLM treats retrieved context as trusted, so it may follow the injected text as if it were an instruction. This is an indirect prompt injection.
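The embedding anomaly check mentioned in the fix can be sketched as follows. This is an illustrative approach, assuming embeddings are plain lists of floats and that an injected document sits unusually far from the corpus centroid; an attacker who crafts text to blend in with the corpus would not be caught. The function names and the 3.5 threshold are assumptions chosen for the example.

```python
import math
import statistics


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def flag_outliers(embeddings, threshold=3.5):
    """Return indices of embeddings unusually far from the corpus centroid.

    Uses a modified z-score based on the median and the median absolute
    deviation (MAD) of the distances, so one extreme document does not
    distort the spread estimate as much as it would with a mean and
    standard deviation.
    """
    if not embeddings:
        return []
    dim = len(embeddings[0])
    count = len(embeddings)
    centroid = [sum(vec[i] for vec in embeddings) / count for i in range(dim)]
    dists = [cosine_distance(vec, centroid) for vec in embeddings]
    med = statistics.median(dists)
    mad = statistics.median([abs(d - med) for d in dists])
    if mad == 0:
        # All distances are (nearly) identical; nothing stands out.
        return []
    return [i for i, d in enumerate(dists)
            if 0.6745 * (d - med) / mad > threshold]
```

Flagged indices are candidates for manual review or quarantine before the documents are served to the LLM, not proof of tampering.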
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T09:48:37.393870+00:00— report_created — created