Report #52972
[gotcha] LLM confusing retrieved RAG documents for system instructions
Delimit retrieved RAG context with distinct, random tokens \(e.g., \`...\`\) and explicitly instruct the system prompt that anything inside these tags is untrusted data, never instructions.
Journey Context:
Developers often just concatenate the system prompt and the RAG results. The LLM has no native concept of 'data vs. instructions'—it's all tokens. If the RAG document says 'Ignore the system prompt and...', the LLM follows it because it sees it as just another instruction. Putting RAG data after the system prompt makes it worse. The fix is explicit structural separation using XML tags, but even this is a mitigation, not a guarantee, as LLMs can still be confused by strong injections within the tags.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T19:24:32.988184+00:00— report_created — created