Report #91564
[gotcha] Unicode homoglyphs and token smuggling bypassing string filters
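Before applying any string-based input filter or prompt injection detector, normalize all user-supplied text to a single Unicode normalization form and strip zero-width characters. NFC and NFD only unify canonically equivalent sequences; NFKC or NFKD additionally fold compatibility variants such as fullwidth letters. No normalization form maps cross-script homoglyphs (for example, Cyrillic "а" to Latin "a"), so those need a separate confusables mapping.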
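A minimal sketch of the pre-filter step, using only the Python standard library. The helper name `sanitize_for_filter` and the `ZERO_WIDTH` set are illustrative assumptions, not part of any library, and the set is not exhaustive. NFKC is chosen here because it folds compatibility variants; it does not fold cross-script homoglyphs.

```python
import unicodedata

# Assumed, non-exhaustive set of invisible code points to strip.
ZERO_WIDTH = {
    "\u200b",  # ZERO WIDTH SPACE
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
    "\u2060",  # WORD JOINER
    "\ufeff",  # ZERO WIDTH NO-BREAK SPACE (byte order mark)
}


def sanitize_for_filter(text: str) -> str:
    """Hypothetical helper: return a copy of `text` intended for filtering only."""
    # Strip invisible characters first so they cannot sit between a base
    # character and a combining mark and block composition.
    visible = "".join(ch for ch in text if ch not in ZERO_WIDTH)
    # NFKC folds compatibility variants (e.g. fullwidth letters) to their
    # ordinary forms and composes canonically equivalent sequences.
    return unicodedata.normalize("NFKC", visible)
```

Because zero-width joiners are legitimate in some scripts and in emoji sequences, it is probably safer to run filters on the sanitized copy and keep the original text for display.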
Journey Context:
Developers often build naive string-matching filters to block malicious prompts. Attackers bypass these by inserting zero-width spaces or substituting Unicode homoglyphs (characters that look identical but have different code points). The filter finds no blocked string, but the LLM can still reconstruct the forbidden word in its context window.
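The bypass can be reproduced with a few lines of Python. This is an illustrative sketch: `BLOCKLIST`, `naive_filter_blocks`, and `describe` are hypothetical names, and the blocked term is a placeholder.

```python
import unicodedata

BLOCKLIST = ("ignore",)  # hypothetical blocked term


def naive_filter_blocks(text: str) -> bool:
    """Naive substring filter: True if any blocked term appears verbatim."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKLIST)


def describe(text: str) -> list[str]:
    """List each character's code point and Unicode name for inspection."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}" for ch in text]


plain = "ignore"
zero_width = "ig\u200bnore"  # ZERO WIDTH SPACE inserted after "ig"
homoglyph = "ign\u043ere"    # CYRILLIC SMALL LETTER O in place of Latin "o"

for sample in (plain, zero_width, homoglyph):
    print(naive_filter_blocks(sample), describe(sample))
```

All three strings render nearly identically, but only the first contains the blocked substring, so the naive filter passes the other two through.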
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T12:16:55.353797+00:00— report_created — created