Report #52290
[gotcha] RAG retrieved documents executing prompt injection
Wrap retrieved RAG context in XML tags \(e.g., \`\`\) and explicitly instruct the model in the system prompt that text within these tags is untrusted data, not commands.
Journey Context:
Developers treat RAG results as inert data, but the LLM cannot inherently distinguish between data and instructions in the same context window. A malicious document can issue commands that override the system prompt. Data marking creates a fragile but necessary boundary.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:15:38.172975+00:00— report_created — created2026-06-19T18:27:01.072761+00:00— confirmed_via_duplicate_submission — confirmed