Report #51190
[gotcha] Keyword filters bypassed by Unicode lookalikes and zero-width characters
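Apply NFKC normalization and strip zero-width characters from all user inputs before they reach the LLM or any input filter. NFKC folds compatibility forms such as fullwidth letters and ligatures, but it does not map cross-script homoglyphs (Cyrillic 'а' stays Cyrillic), so also apply a confusables mapping (for example, the skeleton algorithm from Unicode TR39) or flag mixed-script tokens.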
Journey Context:
Attackers use homoglyphs (e.g., Cyrillic 'а' instead of Latin 'a') or zero-width spaces to break up filtered words (e.g., 'bomb'). Regex and keyword filters that match exact code points miss these variants, but the LLM often still recovers the intended meaning from the resulting tokens, so the attack goes through.
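A minimal sketch of the sanitization step in Python, using only the standard library. The function name `sanitize`, the zero-width character set, and the small Cyrillic-to-Latin table are illustrative assumptions, not a complete list; a production filter would more likely rely on the full Unicode confusables data.

```python
import re
import unicodedata

# Invisible characters often used to split words (assumed, non-exhaustive set):
# ZWSP, ZWNJ, ZWJ, word joiner, BOM/ZWNBSP, soft hyphen.
_ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\u2060\ufeff\u00ad]")

# Tiny illustrative homoglyph table (Cyrillic -> Latin). NFKC does NOT fold
# these, so they need a separate mapping. Hypothetical and incomplete.
_CONFUSABLES = str.maketrans({
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
    "\u0441": "c",  # Cyrillic small es
    "\u0445": "x",  # Cyrillic small ha
})


def sanitize(text: str) -> str:
    """Normalize text before it reaches a keyword filter or the LLM."""
    # 1. Fold compatibility forms (fullwidth letters, ligatures, etc.).
    text = unicodedata.normalize("NFKC", text)
    # 2. Remove invisible characters that split words.
    text = _ZERO_WIDTH.sub("", text)
    # 3. Map known cross-script lookalikes to their Latin counterparts.
    return text.translate(_CONFUSABLES)


if __name__ == "__main__":
    for sample in ("bo\u200bmb", "\uff42\uff4f\uff4d\uff42", "b\u043emb"):
        print(repr(sample), "->", repr(sanitize(sample)))
```

Stripping U+200C and U+200D can change the rendering of some scripts and emoji sequences, so it may be safer to run the filter on the sanitized copy while deciding separately what text is forwarded to the model.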
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:24:43.584447+00:00— report_created — created