Report #96622
[gotcha] RAG systems ingest malicious instructions hidden in document metadata or formatting
Strip metadata, HTML tags and comments, visually hidden text, and formatting from retrieved documents before inserting them into the LLM context. Pass only the raw, sanitized text.
Journey Context:
When building RAG pipelines, developers often parse PDFs, Word documents, or HTML and pass the extracted text directly to the LLM. Attackers hide instructions in white text (the same color as the background), HTML comments, or PDF metadata. The parser extracts this hidden text along with the visible content, and the LLM may treat it as an instruction and follow it in place of the system prompt. Because developers typically inspect the document only visually, they see nothing wrong.
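For the HTML case, the stripping step might look roughly like the sketch below. It uses only Python's standard-library `html.parser` and is an illustration under stated assumptions, not a complete defense: the names `visible_text`, `VisibleTextExtractor`, `HIDDEN_STYLE`, `SKIP_TAGS`, and `VOID_TAGS` are hypothetical, the hidden-style patterns are a small heuristic list (white text is only matched via inline `style` attributes, not external CSS), and PDF or Word metadata would need handling specific to whichever parser is in use.

```python
import re
from html.parser import HTMLParser

# Inline-style patterns treated as "hidden" in this sketch.
# Assumption: a heuristic, non-exhaustive list; external stylesheets are not checked.
HIDDEN_STYLE = re.compile(
    r"display\s*:\s*none|visibility\s*:\s*hidden|font-size\s*:\s*0|"
    r"(?<![-\w])color\s*:\s*(?:#fff(?:fff)?\b|white\b)",
    re.IGNORECASE,
)
# Elements whose contents are never shown to a reader as body text.
SKIP_TAGS = {"script", "style", "template", "noscript", "head"}
# Void elements have no closing tag, so they are never pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes that are not inside a hidden element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._stack = []        # (tag, hides_contents) for each open element
        self._hidden_depth = 0  # number of open elements that hide their contents
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attr_map = dict(attrs)
        hidden = (
            tag in SKIP_TAGS
            or "hidden" in attr_map
            or bool(HIDDEN_STYLE.search(attr_map.get("style") or ""))
        )
        self._stack.append((tag, hidden))
        if hidden:
            self._hidden_depth += 1

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, tolerating unclosed children.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i][0] == tag:
                for _, hidden in self._stack[i:]:
                    if hidden:
                        self._hidden_depth -= 1
                del self._stack[i:]
                break

    def handle_data(self, data):
        if self._hidden_depth == 0:
            self.parts.append(data)

    # handle_comment is deliberately not overridden: the base class
    # implementation does nothing, so HTML comments are dropped.


def visible_text(html: str) -> str:
    """Return whitespace-normalized text from visible elements only."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    # Joining with spaces keeps adjacent blocks apart, at the cost of
    # occasionally splitting a word that spans inline tags.
    return " ".join(" ".join(parser.parts).split())
```

A call such as `visible_text(raw_html)` would then run on each retrieved document before its text is placed in the prompt. Sanitizing reduces the attack surface but does not make the remaining text trustworthy, so retrieved content should still be treated as untrusted data rather than as instructions.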
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:45:50.640706+00:00— report_created — created