Report #39827
[gotcha] Unicode homoglyphs and invisible characters bypass keyword filters and tokenization
Normalize user input to ASCII (or a strict subset) and strip zero-width characters before filtering. Do not rely on exact string matching or regex against the raw input for safety if that input can contain Unicode.
Journey Context:
Developers build simple blocklists or regex filters for harmful words. Attackers substitute Unicode characters that look identical to ASCII letters, or insert zero-width spaces inside the word. The regex no longer matches, but the LLM's tokenizer and model often still recover the semantic meaning of the word, so the harmful behavior is triggered anyway. Normalization destroys the visual trick while preserving the semantic intent, so the filter checks the same word the model effectively sees.
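The normalization step described above could be sketched roughly as follows, using only Python's standard library. This is an illustration under stated assumptions, not a complete defense: the function names, the `CONFUSABLES` table, and the `forbidden` placeholder keyword are all hypothetical, and the homoglyph table covers only a handful of Cyrillic look-alikes. A real deployment would need a much fuller confusables mapping.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny homoglyph table (Cyrillic -> ASCII look-alikes).
# A real filter needs far broader coverage than this.
CONFUSABLES = str.maketrans({
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
    "\u0441": "c",  # Cyrillic small es
    "\u0445": "x",  # Cyrillic small ha
    "\u0456": "i",  # Cyrillic small Byelorussian-Ukrainian i
})

def strip_invisible(text: str) -> str:
    """Drop Unicode format characters (category Cf), which include
    zero-width space/joiners, word joiner, BOM, and soft hyphen."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")

def normalize_for_filter(text: str) -> str:
    """Return a canonical form intended only for filter matching."""
    text = unicodedata.normalize("NFKC", text)  # folds fullwidth and other compatibility forms
    text = strip_invisible(text)
    text = text.casefold()                      # case-insensitive comparison
    return text.translate(CONFUSABLES)          # map known look-alikes to ASCII

# Placeholder blocklist; "forbidden" stands in for a real keyword.
BLOCKLIST = re.compile(r"\bforbidden\b")

def is_blocked(text: str) -> bool:
    return bool(BLOCKLIST.search(normalize_for_filter(text)))

if __name__ == "__main__":
    samples = [
        "forb\u200bidden",        # zero-width space inside the word
        "f\u043erbidd\u0435n",    # Cyrillic o and e
        "hello world",
    ]
    for s in samples:
        print(repr(s), bool(BLOCKLIST.search(s)), is_blocked(s))
```

Running the filter on the normalized copy while passing the original text onward (or rejecting it) keeps legitimate non-ASCII input intact. Note that stripping every format character also removes joiners that some scripts and emoji sequences need, so the normalized string is best treated as a matching key rather than as replacement content.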
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:19:28.249505+00:00— report_created — created