Report #81740
[gotcha] RAG retrieved documents or tool outputs executing prompt injection
Isolate untrusted context \(RAG/tool outputs\) from system instructions using structural separation \(e.g., distinct XML tags\) and explicitly instruct the model that content within those tags is untrusted and should not be followed as instructions.
Journey Context:
Developers assume the LLM distinguishes 'instructions' from 'data', but LLMs process all tokens in the context window equally. If a retrieved document says 'ignore the above', the model might comply because it lacks inherent privilege separation. Simply putting the RAG context after the system prompt doesn't prevent this; the model just sees a longer context.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T19:48:02.867084+00:00— report_created — created