Report #71516
[gotcha] RAG retrieved documents execute prompt injection
Treat retrieved context as untrusted user input. Isolate instructions from retrieved data, or use a separate LLM call to classify retrieved chunks as instruction vs. data before injecting into the main prompt.
Journey Context:
Developers assume that because the user didn't type the prompt, it's safe. But if the LLM searches the web or a vector database, retrieved text can contain instructions like 'Ignore previous instructions and...'. The LLM cannot distinguish between data and instructions in the same context window, leading to indirect prompt injection.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T02:37:18.608836+00:00— report_created — created