Report #83985
[gotcha] Indirect prompt injection through unsanitized RAG metadata
Sanitize and format RAG metadata \(filenames, authors, timestamps\) with the same rigor as the document text, or exclude it from the LLM context window entirely if not strictly necessary.
Journey Context:
Developers often scrub the text content of retrieved documents but blindly append metadata like source: user\_input\_filename.txt to the context. An attacker names their file ignore\_previous\_instructions.txt or embeds injection payloads in metadata fields. The LLM processes the metadata with the same authority as the document text, leading to a silent takeover.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:33:39.091402+00:00— report_created — created