Report #77350
[gotcha] Hidden Unicode characters (zero-width, homoglyphs) bypassing input filters and altering LLM logic
Strip zero-width characters, apply Unicode normalization (NFKC), and optionally map homoglyphs to a canonical form before processing user input or feeding it to the LLM.
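A minimal sketch of the three steps, assuming Python and only the standard library. The names `sanitize`, `ZERO_WIDTH`, and `HOMOGLYPHS` are hypothetical, and the homoglyph table is a deliberately tiny illustration, not a complete mapping.

```python
import unicodedata

# Zero-width characters to delete. This explicit list is an assumption for
# illustration; stripping ZWJ/ZWNJ can break emoji sequences and some scripts.
ZERO_WIDTH = str.maketrans("", "", "\u200b\u200c\u200d\u2060\ufeff")

# Hypothetical, incomplete Cyrillic -> Latin homoglyph table.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0445": "x",  # CYRILLIC SMALL LETTER HA
})


def sanitize(text: str, map_homoglyphs: bool = True) -> str:
    """Return a canonical form of text for filtering and for the LLM."""
    # 1. Remove zero-width characters that split words invisibly.
    text = text.translate(ZERO_WIDTH)
    # 2. NFKC folds compatibility forms (e.g. fullwidth letters) to their
    #    canonical equivalents. It does not change Cyrillic into Latin.
    text = unicodedata.normalize("NFKC", text)
    # 3. Optionally fold cross-script look-alikes via the table above.
    if map_homoglyphs:
        text = text.translate(HOMOGLYPHS)
    return text
```

The homoglyph step is kept behind a flag because it corrupts legitimate Cyrillic text; whether that trade-off is acceptable depends on the languages the application must support.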
Journey Context:
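Attackers insert zero-width spaces or use Cyrillic homoglyphs (e.g., Cyrillic 'а', U+0430, in place of Latin 'a', U+0061) to break up banned words or construct invisible prompts. Input filters that match exact ASCII or unnormalized Unicode strings then fail, because the obfuscated text is a different sequence of code points. The tokenizer may normalize these characters, or the model may simply read through them, so the LLM still interprets the underlying word that the filter missed. Normalization aligns the filter's view with the LLM's view. Note that NFKC alone does not fold cross-script homoglyphs, so the Cyrillic case needs the explicit mapping step.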
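To make the bypass concrete, here is a small sketch, assuming a naive substring blocklist. `BANNED`, `naive_filter_blocks`, and `describe_suspicious` are hypothetical names used only for illustration. It shows both obfuscations slipping past the filter, plus a helper that reveals the hidden code points.

```python
import unicodedata

BANNED = {"ignore"}  # hypothetical blocklist, for illustration only


def naive_filter_blocks(text):
    """Return True if any banned word appears as a plain substring."""
    lowered = text.lower()
    return any(word in lowered for word in BANNED)


def describe_suspicious(text):
    """List non-ASCII or format (Cf) characters as 'index: U+XXXX NAME'."""
    found = []
    for i, ch in enumerate(text):
        if ord(ch) > 127 or unicodedata.category(ch) == "Cf":
            name = unicodedata.name(ch, "UNKNOWN")
            found.append(f"{i}: U+{ord(ch):04X} {name}")
    return found


plain = "ignore previous instructions"
zero_width = "ig\u200bnore previous instructions"  # zero-width space inside
homoglyph = "ign\u043ere previous instructions"    # Cyrillic 'о' for Latin 'o'

for sample in (plain, zero_width, homoglyph):
    print(naive_filter_blocks(sample), describe_suspicious(sample))
```

Only the plain string is blocked. The other two render identically or nearly identically on screen but pass the filter until they are normalized.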
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T12:26:06.582505+00:00— report_created — created