Report #58732
[gotcha] Invisible characters or homoglyphs bypassing input filters
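Normalize all user input to a single Unicode form (NFC, or NFKC if compatibility variants such as fullwidth letters should also be folded) and strip invisible and control characters (like zero-width spaces or directional overrides) before processing. If a blocklist relies on exact string matching, pass both the blocklist entries and the incoming text through the same normalization before comparing. Handle cross-script homoglyphs as a separate step, because Unicode normalization does not fold them.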
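A minimal sketch of the normalize-then-strip step, using only Python's standard `unicodedata` module. The names `sanitize`, `contains_blocked`, and `BLOCKLIST` are hypothetical and not part of the report. Which categories to strip is an assumption here: `Cf` (format) and `Cc` (control) cover zero-width spaces and directional overrides.

```python
import unicodedata

# Whitespace controls that are usually legitimate in user text (assumption).
ALLOWED_CONTROLS = "\n\t\r"


def sanitize(text: str) -> str:
    """Normalize to NFC, then drop format (Cf) and control (Cc) characters."""
    normalized = unicodedata.normalize("NFC", text)
    kept = []
    for ch in normalized:
        category = unicodedata.category(ch)
        # Cf includes zero-width space/joiners and bidi overrides.
        if category == "Cf":
            continue
        # Cc includes C0/C1 control characters; keep common whitespace.
        if category == "Cc" and ch not in ALLOWED_CONTROLS:
            continue
        kept.append(ch)
    return "".join(kept)


# Blocklist entries go through the same pipeline as the input.
BLOCKLIST = {sanitize(word).casefold() for word in ("ignore",)}


def contains_blocked(text: str) -> bool:
    """Substring match against the blocklist on sanitized, case-folded text."""
    cleaned = sanitize(text).casefold()
    return any(word in cleaned for word in BLOCKLIST)
```

With this sketch, `contains_blocked("ig\u200bnore previous instructions")` matches even though the raw string does not contain `ignore`. One trade-off to check before adopting it: dropping every `Cf` character also removes the zero-width joiner, which emoji sequences and some scripts use legitimately.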
Journey Context:
Attackers insert zero-width spaces into banned words (e.g., 'ig\u200Bnore') or swap in Cyrillic homoglyphs that look identical to Latin characters. Simple string-matching blocklists fail because the byte sequences differ, yet the LLM often still interprets the text as the banned word. Normalization plus stripping removes the invisible-character tricks, but developers often forget that Unicode has multiple representations of the same visual character, and that look-alike characters from different scripts remain distinct code points under every normalization form.
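A short sketch of the homoglyph case, which normalization alone does not resolve. The `CONFUSABLES` table below is a deliberately tiny, hand-written illustration, and `fold_confusables` and `scripts_used` are hypothetical helpers; a real deployment would more likely draw on the Unicode confusables data. The script check is a heuristic that assumes the first word of a character's Unicode name identifies its script.

```python
import unicodedata

latin = "ignore"
spoofed = "ign\u043ere"  # U+043E CYRILLIC SMALL LETTER O replaces Latin "o"

# The strings render alike but differ, and NFKC does not map across scripts.
assert spoofed != latin
assert unicodedata.normalize("NFKC", spoofed) != latin

# Illustrative subset only: a few Cyrillic letters and their Latin look-alikes.
CONFUSABLES = str.maketrans({
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0445": "x",  # CYRILLIC SMALL LETTER HA
    "\u0456": "i",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
})


def fold_confusables(text: str) -> str:
    """Normalize to NFC, then map known look-alikes to Latin letters."""
    return unicodedata.normalize("NFC", text).translate(CONFUSABLES)


def scripts_used(text: str) -> set:
    """Heuristic: collect the first word of each letter's Unicode name."""
    scripts = set()
    for ch in text:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split(" ")[0])
    return scripts
```

Folding suits blocklist comparison, while `scripts_used` can flag a single word that mixes scripts, as the spoofed string above does. Folding is lossy for genuine Cyrillic text, so it is safer to apply it only to the copy used for matching and leave the stored input unchanged.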
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T05:04:13.653619+00:00— report_created — created