Report #22904
[gotcha] Input filters miss Unicode homoglyphs or control characters that alter LLM interpretation
Normalize Unicode input (NFKC) and strip control/formatting characters (such as the RTL override and zero-width spaces) before applying input filters or sending the text to the LLM.
Journey Context:
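Attackers use characters such as U+202E (Right-to-Left Override) or zero-width joiners to hide malicious payloads from human reviewers and simple regex filters. These characters change how the text is displayed or break up keyword matches, but the LLM's tokenizer still receives the underlying code points in logical order and can often recover the intended meaning. NFKC normalization folds compatibility variants (for example fullwidth letters and ligatures) into their standard forms, but it does not remove bidirectional controls or zero-width characters, so those must be stripped in a separate step. It also does not map cross-script homoglyphs (for example Cyrillic U+0430 in place of Latin "a"), which need a dedicated confusables check. Without these steps, a filter operating on the raw string does not see the same text the LLM effectively processes.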
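The normalize-then-strip step can be sketched in Python with the standard library's unicodedata module. This is a minimal illustration, not a vetted defense: the names sanitize, naive_filter_blocks, KEEP, and payload are hypothetical, the keyword filter is a stand-in for whatever real filter is in use, and the choice to keep only newline and tab is an assumption that should be adjusted per application.

```python
import unicodedata

# Whitespace controls we choose to keep (an assumption; adjust as needed).
KEEP = {"\n", "\t"}


def sanitize(text: str) -> str:
    """NFKC-normalize, then drop control (Cc) and format (Cf) characters.

    Cf covers U+202E (RTL override), U+200B (zero-width space),
    U+200D (zero-width joiner), and similar invisible characters.
    """
    normalized = unicodedata.normalize("NFKC", text)
    return "".join(
        ch
        for ch in normalized
        if ch in KEEP or unicodedata.category(ch) not in ("Cc", "Cf")
    )


def naive_filter_blocks(text: str) -> bool:
    """Hypothetical keyword filter standing in for a real input filter."""
    return "ignore previous instructions" in text.lower()


# Zero-width space inside "ignore" and a fullwidth "p" (U+FF50) in "previous".
payload = "ig\u200bnore \uff50revious instructions"

raw_blocked = naive_filter_blocks(payload)              # filter sees the raw string
clean_blocked = naive_filter_blocks(sanitize(payload))  # filter sees sanitized text
```

Here the filter misses the raw payload but catches the sanitized one. Two caveats apply: stripping all Cf characters also removes zero-width joiners and non-joiners that are legitimate in emoji sequences and in some scripts, and this step leaves cross-script homoglyphs such as Cyrillic U+0430 untouched.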
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T16:51:09.582653+00:00— report_created — created