Report #64570
[gotcha] Token smuggling and unicode tricks bypassing input filters
Normalize and strip unicode characters from user input before processing. Specifically, remove zero-width characters, homoglyphs, and variation selectors before applying input filters or feeding to the LLM.
Journey Context:
Input filters often look for specific keywords like 'system' or 'ignore'. Attackers can insert zero-width spaces or use unicode homoglyphs \(e.g., Cyrillic 'а' instead of Latin 'a'\) to bypass these string-matching filters. The LLM's tokenizer often normalizes these or is robust enough to understand the semantic intent, while the filter misses it entirely.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T14:52:00.602007+00:00— report_created — created