Report #83689
[gotcha] RAG metadata bypassing prompt injection filters
Sanitize and escape all RAG metadata fields (author, source, dates) with the same rigor as the document text, or exclude metadata from the LLM context entirely.
Journey Context:
Developers often sanitize the retrieved text chunks but append metadata such as 'source: {filename}' to the context unchecked. An attacker can set a filename or author field to 'Ignore previous instructions...'. Because metadata is assumed to be system-controlled, it is often inserted without escaping, so it bypasses text-only sanitizers, and the LLM may give it more implicit trust than the document text.
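A minimal Python sketch of the mitigation, under stated assumptions: the names `ALLOWED_FIELDS`, `MAX_FIELD_LEN`, `sanitize_field`, and `render_chunk`, and the `<untrusted_metadata>` / `<untrusted_document>` delimiters, are hypothetical and not taken from any particular RAG framework. It only shows metadata going through an allowlist and a sanitizer before it reaches the prompt. It does not detect injection phrasing, so it should be treated as one layer rather than a complete defense.

```python
import re
import unicodedata

# Hypothetical allowlist: only these metadata keys ever reach the prompt.
ALLOWED_FIELDS = ("source", "author", "date")
# Hypothetical cap; short fields leave little room for injected instructions.
MAX_FIELD_LEN = 80


def sanitize_field(value: object) -> str:
    """Neutralize one untrusted metadata value before prompt assembly."""
    # Fold compatibility forms (e.g. fullwidth brackets) to their plain equivalents.
    text = unicodedata.normalize("NFKC", str(value))
    # Replace control and format characters (newlines, zero-width chars) with spaces
    # so a value cannot start a new line or hide text.
    text = "".join(
        " " if unicodedata.category(ch).startswith("C") else ch for ch in text
    )
    # Drop characters that could close or forge the delimiters used in render_chunk.
    text = re.sub(r"[<>{}\[\]`]", "", text)
    # Collapse whitespace runs and enforce the length cap.
    text = re.sub(r"\s+", " ", text).strip()
    return text[:MAX_FIELD_LEN]


def render_chunk(sanitized_text: str, metadata: dict) -> str:
    """Build the context block for one retrieved chunk.

    Assumes sanitized_text has already passed the existing document-text sanitizer.
    """
    lines = [
        f"{key}: {sanitize_field(metadata[key])}"
        for key in ALLOWED_FIELDS
        if key in metadata
    ]
    return (
        "<untrusted_metadata>\n"
        + "\n".join(lines)
        + "\n</untrusted_metadata>\n"
        + "<untrusted_document>\n"
        + sanitized_text
        + "\n</untrusted_document>"
    )
```

Calling `render_chunk(text, {"source": filename, "author": author})` drops any key outside the allowlist and flattens each remaining value to a single bounded line. The delimiters only help if the system prompt also tells the model to treat everything inside them as data, not instructions.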
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:03:34.186498+00:00— report_created — created