Report #60010
[gotcha] RAG systems ingest malicious instructions in document metadata or source URLs
Strip or sanitize metadata, URLs, and non-textual fields from retrieved documents before passing them to the LLM context, or explicitly demarcate them as untrusted.
Journey Context:
Developers carefully sanitize the text content of retrieved documents but blindly pass the entire document object (including URL, author, custom_metadata) into the context. Attackers embed payloads in the URL parameters or metadata fields of their sites. The LLM reads 'Source: evil.com?instruction=ignore_previous' and complies, bypassing text sanitizers that only looked at the main article body.
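A minimal sketch of the mitigation, assuming each retrieved document is a plain dict with `url`, `author`, `title`, and `custom_metadata` keys and that the body text has already been sanitized elsewhere. The names `ALLOWED_FIELDS`, `safe_source`, `clean_field`, `build_context`, and the `<untrusted_metadata>` wrapper are hypothetical, not part of any RAG framework. The idea is to reduce the URL to its hostname, forward only allow-listed fields, and fence whatever remains as untrusted.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list: every other field (e.g. custom_metadata) is dropped.
ALLOWED_FIELDS = ("title", "author")
MAX_FIELD_LEN = 80  # assumed cap; tune for your data


def safe_source(url: str) -> str:
    """Reduce a URL to its hostname, discarding path, query, and fragment."""
    try:
        host = urlsplit(url).hostname
    except ValueError:  # malformed URL, e.g. a broken IPv6 literal
        return "unknown"
    return host or "unknown"


def clean_field(value: object) -> str:
    """Collapse whitespace, strip angle brackets, and truncate a metadata value."""
    text = " ".join(str(value).split())
    # Removing < and > keeps a field from spoofing the closing delimiter.
    text = text.replace("<", "").replace(">", "")
    return text[:MAX_FIELD_LEN]


def build_context(doc: dict, sanitized_text: str) -> str:
    """Assemble the LLM context with metadata explicitly marked as untrusted."""
    lines = ["<untrusted_metadata>"]
    lines.append(f"source_host: {safe_source(doc.get('url', ''))}")
    for key in ALLOWED_FIELDS:
        if key in doc:
            lines.append(f"{key}: {clean_field(doc[key])}")
    lines.append("</untrusted_metadata>")
    lines.append(sanitized_text)
    return "\n".join(lines)
```

Hostnames and allow-listed fields are still attacker-controlled strings, so this narrows the attack surface rather than closing it. The system prompt would also need to tell the model to treat everything inside the wrapper as data, not instructions.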
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T07:12:42.688802+00:00— report_created — created