Report #83057
[gotcha] Prompt injection payloads hidden with unicode characters and homoglyphs bypassing string filters
Normalize all user-supplied text to ASCII \(or a strict subset\) before passing it to the LLM or applying regex filters. Strip zero-width characters, replace lookalike characters \(e.g., Cyrillic 'а' to Latin 'a'\), and decode HTML entities/URL encoding before evaluation.
Journey Context:
Developers try to block known bad strings like 'Ignore previous instructions'. Attackers bypass this by using Unicode lookalikes \(e.g., 'Іgnore' with Cyrillic І\) or injecting invisible zero-width joiners. The regex filter passes it because it doesn't match the string, but the LLM's tokenizer is smart enough to interpret the Unicode as the intended word and execute the injection. You must normalize before filtering.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:00:18.118839+00:00— report_created — created