Report #87293
[gotcha] Relying on string matching or regex to filter out malicious prompts, missing invisible tokens or unicode homoglyphs
Normalize unicode \(NFKC\) and strip invisible/control characters \(like RTL override U\+202E\) before applying input filters or sending to the LLM. Do not rely on exact string matching for safety.
Journey Context:
Attackers use Right-to-Left Override or zero-width joiners to hide payloads from simple text filters, or use Cyrillic characters that look like Latin characters \(homoglyphs\) to bypass keyword filters. The LLM still interprets the semantic meaning or the underlying tokens, bypassing the regex filter.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T05:06:33.709333+00:00— report_created — created