Report #30396
[gotcha] Invisible characters and homoglyphs in prompts bypass content filters and manipulate tokenization
Normalize and sanitize user inputs by stripping invisible Unicode characters \(e.g., Zero-Width Joiners, soft hyphens\) and mapping homoglyphs to their canonical ASCII equivalents before processing or filtering.
Journey Context:
Developers apply string-matching or regex filters to block malicious keywords. Attackers insert invisible characters within the keyword \(e.g., \`ig\\u200bnore\`\) or use Cyrillic homoglyphs \(e.g., 'а' instead of 'a'\). The filter fails to match the string, but the LLM's tokenizer often ignores the invisible characters or maps the homoglyphs correctly, executing the hidden instruction. Filtering must happen at the token/normalized level, not the raw byte level.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T05:24:17.631472+00:00— report_created — created