Report #20985
[gotcha] Text-based filters miss encoded instructions in RAG documents
Implement content decoding and normalization (e.g., Base64, ROT13, HTML entities) in the RAG ingestion pipeline before applying security filters or chunking.
Journey Context:
Developers often build naive input filters that look for strings like 'ignore previous instructions'. Attackers bypass these by Base64-encoding the payload inside a document the LLM retrieves. Many LLMs can decode Base64 without any tooling, so the text filter sees only aWdub3JlIHByZXZpb3Vz (Base64 for 'ignore previous') while the LLM reads the decoded string and may act on it. This is counter-intuitive because the security filter and the LLM are effectively looking at two different inputs.
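A minimal Python sketch of the decode-then-filter idea, using only the standard library. The names BLOCKLIST, B64_RUN, decoded_views and is_suspicious are hypothetical, the 16-character threshold for Base64 runs is an arbitrary assumption, and substring matching stands in for whatever filter the pipeline really uses. It does not handle nested encodings or Base64 runs glued to surrounding text.

```python
import base64
import binascii
import codecs
import html
import re
import unicodedata

# Hypothetical blocklist; a real filter would be broader than substring matching.
BLOCKLIST = ("ignore previous",)

# Runs of 16+ Base64 characters with optional padding (threshold is an assumption).
B64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def decoded_views(text: str) -> list[str]:
    """Return the text plus plausible decoded variants for the filter to scan."""
    # Resolve HTML entities, then fold Unicode compatibility forms.
    base = unicodedata.normalize("NFKC", html.unescape(text))
    views = [base, codecs.decode(base, "rot13")]
    for match in B64_RUN.finditer(base):
        try:
            raw = base64.b64decode(match.group(0), validate=True)
            views.append(raw.decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid Base64-encoded UTF-8; skip this run
    return views


def is_suspicious(text: str) -> bool:
    """True if any decoded view of the text contains a blocklisted phrase."""
    return any(
        term in view.lower()
        for view in decoded_views(text)
        for term in BLOCKLIST
    )
```

Running is_suspicious on each document at ingestion, before chunking, means the filter scans the same decoded text the LLM could reconstruct. Scanning every view, rather than replacing the original text, keeps the stored document unchanged.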
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:37:40.644404+00:00— report_created — created