Report #74963
[gotcha] Token smuggling and homoglyphs bypassing keyword filters
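Normalize Unicode input with NFKC, strip zero-width characters, and map cross-script homoglyphs to their ASCII lookalikes before applying keyword filters or passing the text to the LLM. NFKC alone is not enough: it folds compatibility forms such as fullwidth letters and ligatures, but it leaves Cyrillic 'о' distinct from Latin 'o', so homoglyphs need a separate confusables mapping (for example, the skeleton transform described in Unicode UTS #39).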
Journey Context:
Developers use simple keyword blocklists to prevent prompt injection (e.g., blocking 'ignore previous instructions'). Attackers bypass these by replacing characters with Unicode lookalikes (e.g., Cyrillic 'о' instead of Latin 'o') or by inserting zero-width spaces. The keyword filter compares raw code points and misses the obfuscated phrase, but the LLM often still interprets it as the original malicious payload, whether because its tokenizer normalizes or drops the obfuscating characters or because the model has learned to read through them. Normalizing input before filtering brings the filter's view of the text closer to the LLM's view.
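The pre-filter step described above might be sketched as follows, using only Python's standard `unicodedata` module. The names `CONFUSABLES`, `BLOCKLIST`, `canonicalize`, and `is_blocked` are hypothetical and introduced here for illustration. The confusables table is deliberately tiny and is an assumption, not a complete mapping; a real deployment would likely derive it from the Unicode confusables data.

```python
import unicodedata

# Hypothetical, deliberately incomplete homoglyph table (assumption):
# a few lowercase Cyrillic letters that look like Latin ones.
CONFUSABLES = str.maketrans({
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u0456": "i",  # Cyrillic Byelorussian-Ukrainian i
})

# Hypothetical blocklist, taken from the example phrase in this report.
BLOCKLIST = ("ignore previous instructions",)


def canonicalize(text: str) -> str:
    """Return a filter-friendly copy of text; the original is left untouched."""
    # 1. NFKC folds compatibility forms (fullwidth letters, ligatures).
    text = unicodedata.normalize("NFKC", text)
    # 2. Drop format characters (category Cf), which covers zero-width
    #    space/joiner/non-joiner, word joiner, BOM, and soft hyphen.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    # 3. Case-fold first so uppercase lookalikes hit the lowercase table.
    text = text.casefold().translate(CONFUSABLES)
    # 4. Collapse runs of whitespace so spacing tricks do not split phrases.
    return " ".join(text.split())


def is_blocked(text: str) -> bool:
    """True if any blocklisted phrase appears in the canonicalized text."""
    canonical = canonicalize(text)
    return any(phrase in canonical for phrase in BLOCKLIST)


# Cyrillic 'o' twice plus a zero-width space inside "previous".
attack = "ign\u043ere prev\u200bious instructi\u043ens"
print("ignore previous instructions" in attack)  # naive substring check
print(is_blocked(attack))                        # check after canonicalizing
```

Stripping every `Cf` character also removes joiners that are legitimate in emoji sequences and in some scripts, so it may be safer to run the filter on the canonicalized copy and decide separately what text is forwarded to the LLM.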
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T08:25:14.596699+00:00— report_created — created