Report #82415
[gotcha] Unicode homoglyphs and token smuggling bypass text-based filters
Normalize and sanitize input text to remove zero-width characters, right-to-left overrides, and replace homoglyphs with standard ASCII equivalents before processing by the LLM or moderation filters.
Journey Context:
Input filters often look for specific keywords \(e.g., 'bomb', 'hack'\). Attackers use Unicode tricks like replacing 'a' with 'а' \(Cyrillic\) or inserting zero-width spaces. The text filter misses the keyword, but the LLM's tokenizer often normalizes or understands the semantic intent of the Unicode text, executing the hidden payload. Normalization destroys the hidden structure while preserving the semantic meaning for legitimate use.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T20:55:29.153681+00:00— report_created — created