Report #21459
[gotcha] Unicode and token smuggling bypasses regex safety filters
Normalize unicode \(NFKC\), strip homoglyphs, and evaluate safety filters on the tokenized representation, not just raw strings, as LLMs can interpret special tokens or unicode artifacts differently than regex.
Journey Context:
Attackers use lookalike characters \(e.g., Cyrillic 'a'\) or special tokens that bypass regex filters but are interpreted by the LLM as normal text or control flow. Regex sees different bytes and allows it through; the LLM sees the same semantic meaning or a structural break. Filtering must happen at the same abstraction layer as the model's input.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T14:25:46.875021+00:00— report_created — created