Report #91167
[gotcha] Bypassing keyword filters using Unicode homoglyphs and invisible characters
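Normalize Unicode input (e.g., NFKC), strip invisible/control characters, and map cross-script homoglyphs to a canonical form before applying any content filters or feeding the text to the LLM. NFKC alone folds compatibility forms such as fullwidth letters, but it leaves Cyrillic 'а' distinct from Latin 'a', so homoglyphs need a separate confusables mapping.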
Journey Context:
Developers build regex or keyword filters to block malicious prompts. Attackers bypass them by substituting look-alike characters, such as Cyrillic 'а' (U+0430) for Latin 'a' (U+0061), or by inserting zero-width spaces (U+200B) inside a keyword. The filter no longer matches the string, but the LLM often still reads the obfuscated text as the intended words, so the attack goes through.
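A minimal sketch of the sanitize-then-filter order described above, using only the Python standard library. The names CONFUSABLES, BLOCKLIST, sanitize, and is_blocked are hypothetical, the blocked phrase is only an illustration, and the homoglyph table is a tiny hand-picked sample; a real deployment would likely derive its table from the Unicode confusables data (UTS #39).

```python
import re
import unicodedata

# Hypothetical, deliberately tiny homoglyph table (assumption for this sketch).
# It covers only a few Cyrillic letters that look like Latin ones.
CONFUSABLES = str.maketrans({
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u0456": "i",  # Cyrillic Byelorussian-Ukrainian i
})

# Hypothetical blocklist with a single illustrative phrase.
BLOCKLIST = re.compile(r"ignore previous instructions", re.IGNORECASE)


def sanitize(text: str) -> str:
    """Return text in a canonical form suitable for keyword matching."""
    # 1. Fold compatibility forms (fullwidth letters, ligatures, ...).
    text = unicodedata.normalize("NFKC", text)
    # 2. Drop format (Cf) and control (Cc) characters, which include
    #    zero-width spaces and joiners; keep newlines and tabs.
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) not in ("Cf", "Cc")
    )
    # 3. Map known cross-script look-alikes; NFKC does not do this.
    return text.translate(CONFUSABLES)


def is_blocked(text: str) -> bool:
    """Apply the keyword filter to the sanitized text, not the raw input."""
    return BLOCKLIST.search(sanitize(text)) is not None


# Cyrillic 'o' (U+043E) in two places plus a zero-width space (U+200B).
attack = "ign\u043ere prev\u200bious instructi\u043ens"
print(BLOCKLIST.search(attack) is not None)  # raw input slips past the regex
print(is_blocked(attack))                    # sanitized input is caught
```

One trade-off to check before adopting this: stripping every Cf character also removes the zero-width joiners used in emoji sequences and in some scripts, so the sanitized copy may be better used only for filtering, with the decision about what text to forward to the LLM made separately.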
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:37:09.503235+00:00— report_created — created