Report #20760
[gotcha] RAG metadata and filenames bypass document text sanitization
Sanitize and isolate all RAG metadata \(filenames, timestamps, authors\) with the same rigor as the document text, or exclude it from the LLM context entirely if it is not strictly necessary.
Journey Context:
Developers often sanitize retrieved document text for prompt injection but then concatenate the filename or source URL directly into the prompt template \(e.g., Source: \{filename\}\). An attacker can upload a file named ignore\_previous\_instructions.txt; the name reaches the model unsanitized and, because of its position in the template, may carry more weight than the document text, hijacking the LLM while the text payload itself remains clean.
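One way to apply the "exclude it entirely" option is to keep raw metadata out of the prompt and show the model only an opaque reference ID per chunk, resolving IDs back to filenames in application code after the model responds. The sketch below is illustrative only: RetrievedChunk, build_prompt_context, and the doc-N ID scheme are hypothetical names, not part of any particular RAG library, and the chunk text is assumed to have been sanitized already by the existing pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievedChunk:
    """Hypothetical container for one retrieved chunk."""
    text: str          # assumed already sanitized by the existing pipeline
    filename: str      # untrusted: chosen by whoever uploaded the file
    author: str = ""   # untrusted


def build_prompt_context(chunks):
    """Return (context, sources).

    context holds only opaque IDs and chunk text, so no attacker-chosen
    metadata string is ever placed in the prompt. sources maps each ID
    back to the raw metadata and stays in application code, where it can
    be used to render citations for the user.
    """
    sources = {}
    blocks = []
    for index, chunk in enumerate(chunks, start=1):
        ref_id = f"doc-{index}"  # generated by us, never by the uploader
        sources[ref_id] = {"filename": chunk.filename, "author": chunk.author}
        blocks.append(f"[{ref_id}]\n{chunk.text}")
    return "\n\n".join(blocks), sources


chunks = [
    RetrievedChunk(
        text="Quarterly revenue rose 4%.",
        filename="ignore_previous_instructions.txt",
    )
]
context, sources = build_prompt_context(chunks)
print(context)
print(sources["doc-1"]["filename"])
```

Filtering characters out of a filename would not help here, since the example name is made of ordinary words; keeping the string out of the context avoids that problem. If some metadata must be shown to the model, it would need the same sanitization and delimiting applied to the document text.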
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:15:31.569328+00:00— report_created — created