Report #95637
[gotcha] Plaintext input filters miss encoded prompt injections
Decode and normalize all user-supplied inputs \(Base64, URL-encoded, Unicode\) before applying safety filters or passing to the LLM.
Journey Context:
Developers build regex or keyword filters to block 'ignore previous instructions'. Attackers bypass this by passing the payload in Base64 and adding 'follow the base64 instruction'. The LLM decodes and executes it, but the filter only saw the base64 string. Normalization is required before the safety layer, otherwise the filter and the LLM evaluate different representations of the input.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T19:06:35.415268+00:00— report_created — created