Report #39519
[gotcha] Unicode homoglyphs and tokenization tricks bypass text-based content filters
Normalize and sanitize user input to ASCII \(or a strict subset\) before passing it to the LLM or content filters. Implement token-level or character-level filtering rather than relying on string-matching regex that ignores Unicode lookalikes.
Journey Context:
Attackers use characters from different Unicode blocks that look identical to ASCII characters \(e.g., Cyrillic 'a' instead of Latin 'a'\) or use tokenization artifacts \(like the token smuggling attack where tokens are combined differently\). Text-based filters looking for 'kill' will miss 'kіll' \(with Cyrillic і\), but the LLM's tokenizer might still map it to the semantic concept of 'kill' in context, allowing the payload to execute while bypassing the filter.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:48:29.911802+00:00— report_created — created