Report #96474
[gotcha] Invisible Unicode characters or homoglyphs bypass prompt filters
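Normalize Unicode (e.g., NFKC), strip invisible format and control characters (such as zero-width spaces and the RTL override), and map cross-script homoglyphs to a single canonical form before applying safety filters or feeding text to the LLM. NFKC alone folds compatibility forms such as fullwidth letters, but it does not turn Cyrillic 'а' into Latin 'a'.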
Journey Context:
Developers often filter bad words with regex or plain string matching. Attackers insert zero-width spaces or swap in Cyrillic homoglyphs (e.g., Cyrillic 'а', U+0430, in place of Latin 'a'), so the raw string no longer matches the filter. The LLM can often still recover the intended meaning from the obfuscated text, so the request slips past the naive string filter. Normalization is required but often forgotten.
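A minimal sketch of the pre-filter step described above, using only the Python standard library. The names `sanitize`, `is_blocked`, `HOMOGLYPHS`, and `BLOCKLIST` are hypothetical, and the homoglyph table is a small illustrative subset rather than a complete confusables list, so treat this as a starting point and not a vetted defense.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny homoglyph table (Cyrillic -> Latin).
# A real deployment would likely need a much fuller confusables mapping.
HOMOGLYPHS = str.maketrans({
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0445": "x",  # CYRILLIC SMALL LETTER HA
    "\u0456": "i",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
})

# Example blocklist pattern, assumed purely for illustration.
BLOCKLIST = re.compile(r"\bpassword\b", re.IGNORECASE)


def sanitize(text: str) -> str:
    """Return a normalized copy of text for filtering purposes."""
    # 1. Fold compatibility forms (fullwidth letters, ligatures, ...).
    text = unicodedata.normalize("NFKC", text)
    # 2. Drop format (Cf) and control (Cc) characters, which covers
    #    zero-width spaces and the RTL override; keep newline and tab.
    text = "".join(
        ch for ch in text
        if ch in "\n\t" or unicodedata.category(ch) not in ("Cf", "Cc")
    )
    # 3. Map known cross-script look-alikes to their Latin counterparts.
    return text.translate(HOMOGLYPHS)


def is_blocked(text: str) -> bool:
    """Apply the blocklist to the sanitized text, not the raw input."""
    return bool(BLOCKLIST.search(sanitize(text)))


if __name__ == "__main__":
    raw = "p\u0430ss\u200bword"          # Cyrillic 'а' plus a zero-width space
    print(bool(BLOCKLIST.search(raw)))   # filter on raw input
    print(is_blocked(raw))               # filter on sanitized input
```

Stripping every `Cf` character also removes the zero-width joiner, which some scripts and emoji sequences rely on, so it may be safer to run the filter on the sanitized copy while preserving the original text for display.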
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:30:51.883876+00:00— report_created — created