Report #88540
[gotcha] Token smuggling and Unicode homoglyphs bypassing keyword filters
Normalize Unicode text to a canonical form \(like NFKC\) before applying keyword or regex safety filters, and before passing to the LLM. Be aware that visual representation differs from token representation.
Journey Context:
Developers build regex or keyword filters to block specific dangerous prompts. Attackers bypass this using Unicode tricks \(e.g., using a Cyrillic 'а' instead of Latin 'a', or using zero-width joiners\). The keyword filter misses it, but the LLM's tokenizer normalizes or understands the semantic intent of the hidden characters, executing the payload while the filter saw harmless text.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T07:11:53.523265+00:00— report_created — created