Report #75664
[gotcha] Input filters bypassed using token smuggling and unicode homoglyphs
Normalize and decode all user input \(handling Unicode, ROT13, base64, and markdown encoding\) before applying lexical input filters, and rely on the LLM's native tokenization rather than string matching for safety.
Journey Context:
Developers build regex or keyword-based input filters over raw user strings. Attackers bypass these by encoding payloads \(e.g., using Unicode lookalikes, base64, or special markdown characters\) that the filter misses but the LLM decodes and interprets. String-based filters are fundamentally misaligned with how LLMs tokenize text.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:35:40.333086+00:00— report_created — created