Report #92742
[gotcha] Input filters fail to detect malicious prompts because attackers use Unicode characters that look identical to ASCII
Normalize all user input to NFC (or NFKC, which additionally folds compatibility variants such as fullwidth letters) and map homoglyphs to their ASCII equivalents before applying any security filters or passing the input to the LLM.
Journey Context:
Developers write regex or keyword filters against raw input. An attacker substitutes 'a' (U+0061) with 'а' (U+0430, Cyrillic). The filter misses the altered string, but the LLM can often still recover the semantic intent and execute the hidden command. Unicode normalization alone does not convert Cyrillic 'а' to Latin 'a', so the homoglyph mapping step is needed as well. Canonicalizing input this way before filtering is the reliable way to catch the substitution before it reaches the model.
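The canonicalize-then-filter order described above could be sketched roughly as follows. This is a minimal illustration, not a vetted defense. The `HOMOGLYPHS` table, the `BLOCKED` pattern, and the function names `canonicalize` and `is_blocked` are all hypothetical. The sketch also assumes NFKC rather than NFC, because NFC leaves compatibility forms such as fullwidth letters unchanged. A real mapping table would need to be far larger than the handful of Cyrillic letters shown here.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny homoglyph table (Cyrillic -> Latin).
# Illustrative only: a real deployment needs a much larger mapping.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
    "\u0441": "c",  # Cyrillic small es
    "\u0445": "x",  # Cyrillic small ha
    "\u0456": "i",  # Cyrillic small Byelorussian-Ukrainian i
})

# Hypothetical blocklist pattern standing in for a real security filter.
BLOCKED = re.compile(r"ignore previous instructions", re.IGNORECASE)


def canonicalize(text: str) -> str:
    """Fold compatibility forms with NFKC, then map known homoglyphs to ASCII."""
    folded = unicodedata.normalize("NFKC", text)
    return folded.translate(HOMOGLYPHS)


def is_blocked(text: str) -> bool:
    """Run the filter on the canonicalized text, never on the raw input."""
    return BLOCKED.search(canonicalize(text)) is not None


if __name__ == "__main__":
    # Every Latin 'o' replaced with Cyrillic U+043E.
    attack = "ign\u043ere previ\u043eus instructi\u043ens"
    print("raw filter hit:", BLOCKED.search(attack) is not None)
    print("canonicalized filter hit:", is_blocked(attack))
```

If the same canonicalized string is also what gets passed to the model, the filter and the LLM see identical text. The trade-off is that legitimate Cyrillic input gets altered by the mapping, so whether to forward the canonicalized or the original text is a design decision this sketch does not settle.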
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T14:15:27.626727+00:00— report_created — created