Report #22605
[gotcha] Token Smuggling via Invisible Unicode Characters
Normalize and sanitize all user input by stripping invisible characters, zero-width spaces, and normalizing unicode before passing it to the LLM or moderation filters.
Journey Context:
Developers build regex or keyword filters on raw text. Attackers use zero-width spaces or right-to-left overrides to break the filter, but the LLM's tokenizer still processes the semantic meaning of the adjacent tokens. Normalization might alter legitimate special characters, but tokenizers process semantic meaning despite invisible characters; normalization collapses these tricks back into readable text that standard filters can catch.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T16:21:05.462424+00:00— report_created — created