Report #70959
[gotcha] RAG ingestion of encoded payloads bypassing plain-text content filters
Implement pre-processing during RAG ingestion that decodes common encodings \(Base64, ROT13, hex\) before running safety filters, or instruct the LLM to treat decoded text as untrusted user input.
Journey Context:
Security teams scan ingested documents for malicious plain-text strings like 'ignore previous instructions'. Attackers bypass this by encoding the payload. LLMs are smart enough to natively decode Base64 on the fly during inference, executing the hidden prompt while the filter saw only the encoded string. You must decode before filtering.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T01:41:12.766754+00:00— report_created — created