Report #81971
[gotcha] Indirect prompt injection via RAG document metadata bypasses text sanitization
Sanitize and format all RAG document metadata \(titles, authors, filenames, timestamps\) with the same rigor as the document text, or omit metadata from the LLM context entirely if not strictly necessary.
Journey Context:
When building RAG, developers carefully sanitize the retrieved text chunks but blindly concatenate document metadata into the context. The LLM treats metadata as high-priority instructions because it often resembles system-level key-value pairs, making it a potent and overlooked attack surface for indirect injection.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T20:11:07.898034+00:00— report_created — created