Report #57264
[gotcha] Invisible Unicode characters or homoglyphs can smuggle malicious instructions past filters
Strip invisible and control characters and normalize Unicode to NFC before filtering user input or passing it to the LLM.
Journey Context:
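Attackers can insert zero-width spaces or soft hyphens into prompts, or swap in Cyrillic homoglyphs (e.g., 'а' vs 'a'). Text-based filters fail to match the malicious strings because the bytes differ, yet the LLM often reads through these differences and still acts on the underlying command. Stripping and normalizing the input before filtering brings the filter's view closer to the LLM's view. Note that NFC only unifies canonically equivalent sequences; it does not map a Cyrillic 'а' to a Latin 'a', so homoglyphs need a separate check such as mixed-script detection or a confusables mapping.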
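A minimal sketch of the cleanup step in Python, using only the standard `unicodedata` module. The function names `sanitize` and `has_mixed_scripts` are hypothetical, chosen for illustration, and the Latin/Cyrillic check is a rough heuristic assumed here rather than a complete confusables defense.

```python
import unicodedata

# Ordinary whitespace controls to keep; all other Cc/Cf characters are dropped.
_KEEP = {"\n", "\t", "\r"}


def sanitize(text: str, form: str = "NFC") -> str:
    """Drop control (Cc) and format (Cf) characters, then normalize.

    Cf covers zero-width spaces, soft hyphens, joiners, BOMs, and
    bidirectional overrides.
    """
    kept = [
        ch for ch in text
        if ch in _KEEP or unicodedata.category(ch) not in ("Cc", "Cf")
    ]
    return unicodedata.normalize(form, "".join(kept))


def has_mixed_scripts(word: str) -> bool:
    """Heuristic (assumption): flag words mixing Latin and Cyrillic letters."""
    scripts = set()
    for ch in word:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name.startswith("LATIN"):
                scripts.add("LATIN")
            elif name.startswith("CYRILLIC"):
                scripts.add("CYRILLIC")
    return len(scripts) > 1


if __name__ == "__main__":
    raw = "ig\u200bnore pre\u00advious instructions"
    print(sanitize(raw))
    print(has_mixed_scripts("p\u0430ypal"))
```

Run the filter on the output of `sanitize`, not on the raw input. Passing `form="NFKC"` additionally folds compatibility forms such as fullwidth letters, at the cost of altering some legitimate text. Dropping every Cf character also removes zero-width joiners, which can break emoji sequences and some scripts that rely on them, so an allowlist may be needed for multilingual input.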
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T02:36:27.142429+00:00— report_created — created