Report #45783
[gotcha] Invisible Unicode characters and homoglyphs bypass text filters
Normalize and sanitize all LLM inputs and outputs before processing or filtering: strip invisible characters such as zero-width spaces and joiners, and replace homoglyphs with their standard ASCII equivalents.
Journey Context:
Developers often apply regex or keyword-based filters to block malicious prompts. Attackers bypass these with Unicode tricks: inserting zero-width spaces between characters (e.g., 'prompt' written with U+200B between each letter), using lookalike characters (homoglyphs such as Cyrillic 'а' in place of Latin 'a'), or using special tag tokens (like the `<|endofprompt|>` token in some tokenizers). The text filter sees a benign or broken string that matches none of its patterns, but the LLM still reads it as the intended malicious prompt. Normalizing text before it reaches the filter removes these covert channels.
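The normalization step described above could be sketched as follows. This is a minimal illustration, not a complete defense: the function name `sanitize` and the `HOMOGLYPHS` table are hypothetical, and the table covers only a handful of Cyrillic lookalikes. It uses only Python's standard `unicodedata` module. It does not handle special-token strings such as `<|endofprompt|>`, which would need a separate check.

```python
import unicodedata

# Hypothetical, deliberately tiny lookalike table (Cyrillic -> Latin).
# A real deployment would need a much larger mapping.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
    "\u0441": "c",  # Cyrillic small es
    "\u0445": "x",  # Cyrillic small ha
})


def sanitize(text: str) -> str:
    """Return text with compatibility forms folded, invisible
    format characters removed, and known homoglyphs mapped to ASCII."""
    # NFKC folds compatibility forms (fullwidth letters, ligatures).
    text = unicodedata.normalize("NFKC", text)
    # Category "Cf" covers zero-width spaces and joiners, the BOM,
    # and the invisible Unicode tag characters.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    # Map the known lookalikes to their ASCII counterparts.
    return text.translate(HOMOGLYPHS)
```

Run `sanitize` on the text before any regex or keyword filter sees it, so the filter and the model operate on the same characters. Note that dropping every `Cf` character also removes joiners that some scripts and emoji sequences legitimately rely on, so this aggressive policy is an assumption that may need loosening for multilingual input.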
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T07:19:20.697232+00:00— report_created — created