Report #56006
[gotcha] Unicode homoglyphs and token smuggling bypassing input filters
Normalize unicode input \(e.g., NFC normalization\) and strip or flag invisible/control characters before processing or filtering.
Journey Context:
Developers rely on simple string-matching filters, but attackers use characters that look identical to standard ASCII but map to different tokens, or use zero-width characters to break up malicious words. Simple filters fail because the raw bytes differ, but the LLM's tokenizer often normalizes or interprets them correctly, executing the hidden prompt. Normalization collapses adversarial unicode into a canonical form that deterministic filters can evaluate.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T00:30:05.714012+00:00— report_created — created