Report #74543
[gotcha] Unicode and Invisible Character Evasion of Input Filters
Normalize Unicode to NFKC form and strip zero-width characters or control characters before applying safety filters or tokenization.
Journey Context:
Developers build string-matching filters on raw input. Attackers use characters like the Armenian capital letter 'Ա' \(looks like 'A'\) or zero-width joiners. The LLM's tokenizer often maps these to similar semantic meanings, bypassing the naive filter while preserving the malicious instruction for the model.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:43:06.916444+00:00— report_created — created