Report #52408
[gotcha] Homoglyphs and invisible Unicode characters bypassing content filters
Normalize Unicode and strip invisible/control characters (like RTL override, zero-width spaces) from user input before passing text to the LLM or safety classifiers.
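A minimal sketch of that fix in Python, using only the standard library. The function name `sanitize_text`, the choice of NFKC as the normalization form, and the decision to keep newlines and tabs are assumptions made for illustration, not something this report specifies.

```python
import unicodedata

# Unicode general categories to drop (an assumption for this sketch):
#   Cf = format characters (zero-width space U+200B, RTL override U+202E, BOM U+FEFF, ...)
#   Cc = control characters
DROP_CATEGORIES = {"Cf", "Cc"}

# Control characters we still want to keep as ordinary whitespace.
KEEP = {"\n", "\t"}


def sanitize_text(text: str) -> str:
    """Normalize to NFKC, then remove format/control characters."""
    # NFKC folds compatibility forms (e.g. fullwidth letters) into their plain equivalents.
    normalized = unicodedata.normalize("NFKC", text)
    return "".join(
        ch
        for ch in normalized
        if ch in KEEP or unicodedata.category(ch) not in DROP_CATEGORIES
    )


cleaned = sanitize_text("b\u200bomb")  # zero-width space removed before filtering
```

Dropping every `Cf` character is a blunt choice: it also removes the zero-width joiner used in emoji sequences and the zero-width non-joiner used in some scripts, so an allowlist may be needed if that text must survive intact.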
Journey Context:
Attackers use characters that look identical to a human reader but that the tokenizer treats as different (homoglyphs), or invisible characters that change how the text is tokenized. Either trick can bypass keyword filters or alter the meaning the model parses. For example, a filter looking for 'bomb' won't catch 'b\u200bomb', which contains a zero-width space (U+200B).
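The bypass can be reproduced with a few lines of Python. The blocklist and the `naive_filter_hits` helper below are hypothetical stand-ins for a keyword filter; the Cyrillic example is added here to illustrate the homoglyph case and is not taken from the report.

```python
import unicodedata

BLOCKLIST = {"bomb"}  # hypothetical keyword list


def naive_filter_hits(text: str) -> bool:
    """Substring keyword filter with no Unicode handling."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKLIST)


plain = "bomb"
zero_width = "b\u200bomb"  # zero-width space (U+200B) inside the word
homoglyph = "b\u043emb"    # Cyrillic small letter o (U+043E) instead of Latin o

results = {
    "plain": naive_filter_hits(plain),
    "zero_width": naive_filter_hits(zero_width),
    "homoglyph": naive_filter_hits(homoglyph),
}

# NFKC normalization leaves the Cyrillic letter unchanged,
# so normalization alone does not make the filter match.
homoglyph_after_nfkc = unicodedata.normalize("NFKC", homoglyph)
still_missed = not naive_filter_hits(homoglyph_after_nfkc)
```

Only the plain string trips the filter. Note that Unicode normalization does not map look-alike letters across scripts, so cross-script homoglyphs likely need a separate step, such as a confusables mapping or a mixed-script check.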
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:27:36.849828+00:00— report_created — created