Report #58996
[gotcha] Unicode and token smuggling bypassing input filters
Normalize and sanitize input text (stripping non-standard Unicode, RTL overrides, and zero-width characters) before applying regex or keyword-based input filters and before passing the text to the LLM.
Journey Context:
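Developers build input filters that look for phrases such as 'ignore previous instructions'. Attackers send the same phrase with zero-width spaces inserted or with letters swapped for homoglyphs, so the raw string no longer matches the filter's pattern. The filter misses it, but the LLM still reads the text as the original phrase (whether or not the tokenizer drops the extra characters) and executes the payload.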
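A minimal Python sketch of the normalize-then-filter order described above, using only the standard library. The names BLOCKLIST, HOMOGLYPHS, sanitize and is_blocked are hypothetical, the blocklist pattern is a stand-in for whatever filter is already in place, and the homoglyph table covers only a handful of Cyrillic look-alikes for illustration.

```python
import re
import unicodedata

# Hypothetical keyword filter; stands in for an existing regex blocklist.
BLOCKLIST = re.compile(r"ignore\s+previous\s+instructions", re.IGNORECASE)

# Illustrative (not exhaustive) map of Cyrillic look-alikes to Latin letters.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # Cyrillic a
    "\u0435": "e",  # Cyrillic ie
    "\u043e": "o",  # Cyrillic o
    "\u0440": "p",  # Cyrillic er
    "\u0441": "c",  # Cyrillic es
    "\u0456": "i",  # Cyrillic Byelorussian-Ukrainian i
    "\u0455": "s",  # Cyrillic dze
})


def sanitize(text: str) -> str:
    """Return a canonical form of text suitable for filtering."""
    # NFKC folds compatibility forms such as fullwidth letters and ligatures.
    text = unicodedata.normalize("NFKC", text)
    # Drop format characters (general category Cf): zero-width spaces and
    # joiners, bidirectional overrides, word joiners, byte-order marks.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    # Fold the known look-alike letters to their Latin counterparts.
    return text.translate(HOMOGLYPHS)


def is_blocked(text: str) -> bool:
    """Run the filter on the sanitized text, never on the raw input."""
    return BLOCKLIST.search(sanitize(text)) is not None


if __name__ == "__main__":
    # Zero-width space, Cyrillic i, and zero-width joiner hidden in the phrase.
    payload = "ig\u200bnore prev\u0456ous instruct\u200dions"
    print("raw filter hit:      ", BLOCKLIST.search(payload) is not None)
    print("sanitized filter hit:", is_blocked(payload))
```

The same sanitized string should be what gets forwarded to the LLM, so the filter and the model see identical text. Two trade-offs to check before adopting this: removing every Cf character also breaks emoji joined with zero-width joiners and text in scripts that rely on zero-width non-joiners, and a hand-written homoglyph table will miss most look-alikes, so broader coverage would need a fuller confusables dataset.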
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T05:30:58.091216+00:00— report_created — created