Report #42136
[gotcha] Indirect prompt injection through RAG document metadata or filenames
Treat document metadata (filenames, authors, timestamps) as untrusted input: sanitize it and isolate it within the LLM context exactly as you would the document body, or omit it from the context window entirely if it is not strictly necessary.
Journey Context:
When building a RAG pipeline, developers carefully chunk and clean the visible text but then naively prepend 'Source: {filename}' to the context to provide citations. Attackers name their files 'ignore_previous_instructions.txt' or inject payloads into PDF author metadata. Because metadata is often treated as trusted system context rather than untrusted user content, it tends to skip the cleaning applied to the body, and the LLM can give it disproportionate weight, allowing the injection to override the system prompt.
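One way to apply this is sketched below in Python. It is a minimal illustration, not a vetted defense: the names `sanitize_metadata_value`, `build_context`, the `doc-N` identifier scheme, the `<document>` delimiters, and the 80-character limit are all assumptions made for the example. The idea is to keep filenames out of the prompt altogether by citing opaque IDs and resolving them back to filenames outside the model, and to reduce any metadata that must be shown to a short, single-line label.

```python
import re
import unicodedata

# Hypothetical limit; tune it for your own pipeline.
MAX_FIELD_LEN = 80


def sanitize_metadata_value(value: str, max_len: int = MAX_FIELD_LEN) -> str:
    """Reduce an untrusted metadata string to a short, single-line label."""
    # Fold compatibility and look-alike characters into canonical forms.
    text = unicodedata.normalize("NFKC", str(value))
    # Keep a conservative character set; everything else (newlines,
    # markup, punctuation used for role markers) becomes a space.
    text = re.sub(r"[^A-Za-z0-9._\- ]", " ", text)
    # Collapse whitespace runs and cap the length.
    text = re.sub(r"\s+", " ", text).strip()
    return text[:max_len]


def build_context(chunks):
    """Build prompt context that cites opaque IDs instead of filenames.

    `chunks` is a list of dicts with 'filename' and 'text' keys (an
    assumed shape). Returns (context_string, citation_map). The chunk
    text is assumed to be cleaned by the existing body pipeline.
    """
    citation_map = {}
    parts = []
    for i, chunk in enumerate(chunks, start=1):
        doc_id = f"doc-{i}"
        # The real filename stays on the application side only.
        citation_map[doc_id] = chunk["filename"]
        parts.append(f'<document id="{doc_id}">\n{chunk["text"]}\n</document>')
    return "\n".join(parts), citation_map


chunks = [{"filename": "ignore_previous_instructions.txt",
           "text": "Quarterly revenue rose."}]
context, citation_map = build_context(chunks)
# The model sees only "doc-1"; the application maps it back for display.
```

Note that character filtering alone does not neutralize a filename such as 'ignore_previous_instructions.txt', since every character in it is ordinary. That is why the sketch prefers opaque IDs, and treats `sanitize_metadata_value` as a fallback for fields that genuinely have to appear in the prompt.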
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T01:11:43.976632+00:00— report_created — created