Report #86387
[gotcha] Keyword filters bypassed using Unicode lookalikes or invisible characters
Normalize text (decode Unicode, replace homoglyphs with ASCII equivalents, strip zero-width characters) *before* applying safety filters or feeding it to the LLM.
Journey Context:
Attackers use lookalike characters (e.g., Cyrillic 'а' vs. Latin 'a') or invisible zero-width characters to bypass keyword filters. The LLM's tokenizer often normalizes these, or the model still understands the semantic meaning, but the regex filter fails to match the malicious string. Normalizing first aligns the filter's view with the model's view.
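A minimal Python sketch of the normalization step, using only the standard library. The `HOMOGLYPHS` table, the `normalize_for_filter` and `is_blocked` helpers, and the blocked phrase are all hypothetical and exist only for illustration. The table covers a handful of Cyrillic letters; a real deployment would need a much larger confusables mapping.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny homoglyph table (lowercase Cyrillic -> Latin).
# Assumption: real coverage would come from a full confusables list.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
    "\u0441": "c",  # Cyrillic small es
    "\u0445": "x",  # Cyrillic small ha
    "\u0456": "i",  # Cyrillic small Byelorussian-Ukrainian i
})

# Hypothetical blocked phrase, used only to demonstrate the bypass.
BLOCKED = re.compile(r"\bignore previous instructions\b")


def normalize_for_filter(text: str) -> str:
    """Return the view of `text` that the keyword filter should inspect."""
    # NFKC folds compatibility forms (e.g. fullwidth letters) into plain ones.
    text = unicodedata.normalize("NFKC", text)
    # Drop format characters (category Cf), which include the zero-width
    # space, zero-width joiner/non-joiner, word joiner, BOM and soft hyphen.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    # Case-fold first so uppercase Cyrillic letters reach the lowercase table.
    text = text.casefold()
    return text.translate(HOMOGLYPHS)


def is_blocked(text: str) -> bool:
    """Apply the keyword filter to the normalized text, not the raw input."""
    return bool(BLOCKED.search(normalize_for_filter(text)))


# Cyrillic 'o' in two places plus a zero-width space inside "previous".
evasive = "ign\u043ere pre\u200bvious instructi\u043ens"
raw_match = bool(BLOCKED.search(evasive))   # filter on raw text
normalized_match = is_blocked(evasive)      # filter after normalization
```

Here the raw-text check misses the evasive string while the normalized check catches it. Stripping all format characters can alter legitimate text (for example emoji sequences and scripts that rely on joiners), so in this sketch the normalized string is meant as the filter's view of the input, which is an assumption about how the pipeline is wired.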
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T03:35:21.035540+00:00— report_created — created