Report #49087
[gotcha] Zero-width characters or homoglyphs bypass LLM input safety filters
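Normalize Unicode input (NFKC) and strip zero-width and control characters before passing text to the LLM or applying regex-based safety filters. NFKC folds compatibility forms such as fullwidth letters, but it does not remove zero-width characters and does not map cross-script homoglyphs (Cyrillic 'і' stays Cyrillic), so the stripping step and a separate confusables mapping or mixed-script check are both still needed.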
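A minimal Python sketch of this preprocessing, using only the standard library. The `HOMOGLYPHS` table, the `BLOCKLIST` pattern, and the function names are hypothetical and chosen for illustration; a real deployment would likely build the mapping from the Unicode confusables data rather than a hand-written table.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny homoglyph table (assumption for this sketch).
HOMOGLYPHS = str.maketrans({
    "\u0456": "i",  # Cyrillic small letter Byelorussian-Ukrainian i
    "\u0430": "a",  # Cyrillic small letter a
    "\u0435": "e",  # Cyrillic small letter ie
    "\u043e": "o",  # Cyrillic small letter o
    "\u0440": "p",  # Cyrillic small letter er
    "\u0441": "c",  # Cyrillic small letter es
})

# Control characters that are usually legitimate in user text.
KEEP_CONTROLS = {"\n", "\t"}


def sanitize(text: str) -> str:
    """NFKC-normalize, then drop control (Cc) and format (Cf) characters.

    Zero-width space/joiner/non-joiner, word joiner and BOM are all
    category Cf, so they are removed here.
    """
    text = unicodedata.normalize("NFKC", text)
    return "".join(
        ch for ch in text
        if ch in KEEP_CONTROLS or unicodedata.category(ch) not in ("Cc", "Cf")
    )


def fold_for_matching(text: str) -> str:
    """Sanitize, then fold known homoglyphs to ASCII for filter matching only."""
    return sanitize(text).translate(HOMOGLYPHS)


# Example pattern only; the real filter list is application-specific.
BLOCKLIST = re.compile(r"ignore\s+previous\s+instructions", re.IGNORECASE)


def is_blocked(text: str) -> bool:
    return BLOCKLIST.search(fold_for_matching(text)) is not None
```

The homoglyph folding is applied only to the copy used for matching, because translating Cyrillic letters in the text sent to the model would corrupt legitimate non-Latin input. Stripping all Cf characters also removes the zero-width joiner and non-joiner, which some emoji sequences and scripts rely on, so this trade-off should be checked against the expected input.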
Journey Context:
Developers filter exact strings like 'ignore previous instructions'. Attackers insert zero-width spaces between letters or substitute Cyrillic homoglyphs (e.g., 'і' instead of 'i'). The regex no longer matches, but the LLM often still reads through the invisible or look-alike characters and interprets the text as the original malicious instruction.
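The bypass can be reproduced with a short Python check. The pattern and the sample strings are illustrative, and `mixed_script_words` is a hypothetical heuristic that flags words mixing scripts by looking at the first word of each character's Unicode name; it is a rough signal, not a complete confusables detector.

```python
import re
import unicodedata

# A naive exact-phrase filter of the kind described above.
naive = re.compile(r"ignore previous instructions", re.IGNORECASE)

# Sample attack strings: U+200B is a zero-width space, U+0456 is Cyrillic 'і'.
zero_width = "ig\u200bnore previous instructions"
homoglyph = "\u0456gnore previous instructions"


def naive_match(text: str) -> bool:
    return naive.search(text) is not None


def nfkc_match(text: str) -> bool:
    # NFKC alone: it neither strips U+200B nor maps Cyrillic to Latin.
    return naive.search(unicodedata.normalize("NFKC", text)) is not None


def mixed_script_words(text: str) -> list[str]:
    """Return whitespace-separated words whose letters span more than one script."""
    flagged = []
    for word in text.split():
        scripts = {
            unicodedata.name(ch, "UNKNOWN").split()[0]
            for ch in word
            if ch.isalpha()
        }
        if len(scripts) > 1:
            flagged.append(word)
    return flagged


print(naive_match(zero_width), naive_match(homoglyph))  # both filters miss
print(nfkc_match(zero_width), nfkc_match(homoglyph))    # NFKC alone also misses
print(mixed_script_words(homoglyph))                    # flags the mixed-script word
```

Both sample strings slip past the plain regex, and also past the regex applied after NFKC alone, which is why explicit stripping of invisible characters and some handling of look-alike letters are needed in addition to normalization.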
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T12:52:24.558435+00:00 - report_created - created