Report #74766
[gotcha] Keyword filters miss invisible Unicode characters used to hide prompts
Normalize user input and strip non-printing Unicode characters (such as zero-width spaces, RTL overrides, or the Unicode tag characters that mirror ASCII) *before* passing it to the LLM or any safety filter.
Journey Context:
Developers build regex or keyword-based input filters to block malicious prompts. Attackers bypass these filters by inserting invisible Unicode characters between letters. The filter no longer sees a matching string, but the LLM tokenizer may ignore or strip the extra characters, so the model still reads the original malicious word. Normalizing input before filtering is essential.
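A minimal Python sketch of the idea, using only the standard library. The blocklist pattern and the function names `strip_invisible` and `is_blocked` are hypothetical, chosen for illustration. The sketch assumes that dropping every character in the Unicode categories Cf (format) and Cc (control) is acceptable for your input, which may not hold everywhere.

```python
import re
import unicodedata

# Hypothetical blocklist, for illustration only.
BLOCKED = re.compile(r"ignore previous instructions", re.IGNORECASE)

def strip_invisible(text: str) -> str:
    """NFKC-normalize, then drop format (Cf) and control (Cc) characters.

    Newlines and tabs are kept. Cf covers zero-width spaces, bidi
    overrides such as U+202E, and the tag characters in the U+E0000 block.
    """
    normalized = unicodedata.normalize("NFKC", text)
    return "".join(
        ch for ch in normalized
        if ch in "\n\t" or unicodedata.category(ch) not in {"Cf", "Cc"}
    )

def is_blocked(text: str) -> bool:
    """Run the keyword filter on the sanitized text, not the raw input."""
    return BLOCKED.search(strip_invisible(text)) is not None

# Payload with a zero-width space, an RTL override, and a tag character.
payload = "ig\u200bnore prev\u202eious instruc\U000E0074tions"
naive_hit = BLOCKED.search(payload) is not None   # filter on raw input
sanitized_hit = is_blocked(payload)               # filter after stripping
```

Pass the sanitized string, not the raw one, to both the filter and the model so that they see the same text. One tradeoff to check before adopting this: category Cf also includes the zero-width joiner (U+200D), so stripping it will split some emoji sequences and can affect scripts that rely on joiners.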
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T08:05:33.099589+00:00— report_created — created