Report #24309
[gotcha] Unicode homoglyphs and invisible characters bypassing keyword filters
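Normalize Unicode (e.g., NFKC) and strip invisible characters (zero-width spaces and joiners, soft hyphens) from user input before applying safety filters or sending it to the LLM. NFKC already maps non-breaking spaces to ordinary spaces, but it does not fold cross-script homoglyphs, so those need a separate confusables mapping.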
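A minimal sketch of that preprocessing step in Python, using only the standard `unicodedata` module. The function name `sanitize_for_filter` is hypothetical, and treating every character in the Unicode "Cf" (format) category as strippable is an assumption: it covers zero-width spaces, joiners, and soft hyphens, but it also breaks emoji joiner sequences and scripts that rely on zero-width non-joiners, so it is probably best applied to the copy of the input that the filter inspects.

```python
import unicodedata


def sanitize_for_filter(text: str) -> str:
    """Return a filter-friendly copy of `text` (hypothetical helper).

    1. NFKC folds compatibility forms: fullwidth letters, ligatures,
       and non-breaking spaces (which become ordinary spaces).
    2. Characters in category "Cf" (format) are dropped; this includes
       U+200B, U+200C, U+200D, U+2060, U+FEFF and U+00AD.
    """
    normalized = unicodedata.normalize("NFKC", text)
    return "".join(ch for ch in normalized if unicodedata.category(ch) != "Cf")


if __name__ == "__main__":
    # Zero-width space hidden inside a word.
    print(repr(sanitize_for_filter("ig\u200bnore")))
    # Non-breaking space between two letters.
    print(repr(sanitize_for_filter("a\u00a0b")))
```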
Journey Context:
Keyword filters and regexes that look for harmful terms are easily bypassed by swapping characters for visually identical homoglyphs (e.g., Cyrillic 'а', U+0430, instead of Latin 'a', U+0061) or by inserting invisible characters. The LLM's tokenizer may normalize these, or the model may still understand the semantic intent, so the attack executes while the filter misses it entirely. Normalization aligns the filter's view with the LLM's view.
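The bypass can be reproduced in a few lines of Python. Everything named here is illustrative: `BLOCKED`, `naive_filter`, `hardened_filter`, and the tiny `CONFUSABLES` table are hypothetical and not taken from any real filter. The table covers only six Cyrillic look-alikes; a real deployment would more likely draw on the Unicode confusables data (UTS #39). Note that NFKC alone leaves the Cyrillic letter untouched, which is why the extra mapping is included.

```python
import unicodedata

# Hypothetical blocklist, for illustration only.
BLOCKED = ("attack",)

# Hypothetical, deliberately tiny map of Cyrillic look-alikes to Latin.
CONFUSABLES = str.maketrans({
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u0445": "x",  # Cyrillic ha
})


def naive_filter(text: str) -> bool:
    """Flag text by raw substring match, with no normalization."""
    return any(term in text.lower() for term in BLOCKED)


def hardened_filter(text: str) -> bool:
    """Flag text after NFKC, confusable folding, and format-char removal."""
    folded = unicodedata.normalize("NFKC", text).translate(CONFUSABLES)
    folded = "".join(ch for ch in folded if unicodedata.category(ch) != "Cf")
    return any(term in folded.casefold() for term in BLOCKED)


if __name__ == "__main__":
    spoofed = "att\u0430ck"   # Cyrillic letter in place of Latin 'a'
    hidden = "at\u200dtack"   # zero-width joiner inside the word
    for sample in (spoofed, hidden):
        print(naive_filter(sample), hardened_filter(sample))
```

Folding Cyrillic into Latin mangles legitimate Cyrillic text, so under these assumptions the folded string should only feed the filter decision, while the original input (or a lighter normalization of it) is what gets passed on.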
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T19:12:32.379544+00:00— report_created — created