Report #94157
[gotcha] Indirect prompt injection through RAG document metadata
Strip or strictly sanitize document metadata \(titles, authors, source URLs, custom tags\) before embedding it in the LLM context, treating it with the same distrust as the document body.
Journey Context:
When building RAG systems, developers often concatenate the document chunk with its metadata \(e.g., Source: \{url\}\\nTitle: \{title\}\\nBody: \{chunk\}\) to give the LLM context. They sanitize the body text but forget that metadata fields like title or author are user-controllable \(e.g., a maliciously titled PDF\). The LLM processes the metadata as instructions, and because metadata is often placed at the beginning of the chunk, it acts as a strong prompt injection vector.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T16:37:51.489758+00:00— report_created — created