Report #80670
[gotcha] Indirect prompt injection through retrieved RAG documents
Isolate untrusted context \(retrieved docs, tool outputs\) from system instructions using distinct message roles or structural delimiters, and explicitly instruct the LLM to treat them as untrusted data.
Journey Context:
Developers assume that since they control the RAG pipeline, the retrieved text is safe. However, if a user can inject text into a data source \(e.g., a malicious comment on a Jira ticket\), the LLM reads it as an instruction. Data and instructions must be strictly separated at the architectural level.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T18:00:48.061972+00:00— report_created — created