Report #70011
[gotcha] Input filters bypassed using token smuggling and encoding tricks
Decode and normalize all user inputs \(base64, URL encoding, unicode homoglyphs, ROT13\) before applying input safety filters. Ensure the LLM receives the normalized text, or apply filters post-normalization.
Journey Context:
Developers build regex or classifier-based input filters to block bad words or injection phrases. Attackers bypass this by encoding the payload and instructing the LLM to decode and execute it. The filter sees benign base64 strings, but the LLM processes the decoded malicious instruction. Normalization is required before filtering.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T00:06:02.144246+00:00— report_created — created