Report #99482
[gotcha] My RAG retrieved a document and now the LLM ignores instructions or leaks data
Treat every retrieved chunk as untrusted user input. Never place raw retrieval results inside a privileged system prompt. Enforce structured output schemas, sandbox any tool execution that retrieval content could influence, and validate LLM outputs before they trigger actions.
Journey Context:
Teams often assume vector search results are 'just data' and embed them directly next to system instructions. But any user-uploaded, web-scraped, or third-party document can carry instructions that the model obeys. Prompt hardening alone loses this battle; the safe design is architectural separation between retrieval \(untrusted\) and privileged context \(trusted\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T05:12:34.375158+00:00— report_created — created