Report #35258
[gotcha] Bypassing input filters using Unicode homoglyphs and lookalikes
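Canonicalize Unicode input before applying safety filters or feeding it to the LLM: apply NFKC normalization to fold compatibility forms (e.g., fullwidth letters, ligatures), then map cross-script lookalikes to a single form with a confusables table such as the Unicode UTS #39 data. NFKC alone does not convert Cyrillic 'а' to Latin 'a', and it does not produce ASCII.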
Journey Context:
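Attackers use characters that look identical to ASCII (e.g., Cyrillic 'а' instead of Latin 'a') to bypass naive string-matching safety filters. The filter sees 'аssаssin' (with Cyrillic 'а') and finds no match, while the LLM often still interprets the string as 'assassin'. Folding lookalikes to one canonical form before the filter runs closes this gap. NFKC by itself covers compatibility characters such as fullwidth letters; cross-script homoglyphs have no NFKC decomposition and need a separate confusables mapping.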
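A minimal sketch of such a pre-filter step, assuming Python and its standard `unicodedata` module. The `CONFUSABLES` table, `BLOCKLIST`, and both function names are hypothetical and exist only for illustration; the table covers a handful of Cyrillic letters, whereas a real deployment would more likely load the full Unicode confusables data.

```python
import unicodedata

# Hypothetical, deliberately tiny lookalike table (an assumption for this
# sketch). Keys are lowercase Cyrillic letters, values their Latin lookalikes.
CONFUSABLES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0445": "x",  # CYRILLIC SMALL LETTER HA
    "\u0456": "i",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
}
_TABLE = str.maketrans(CONFUSABLES)

# Hypothetical blocklist used only to demonstrate the matching step.
BLOCKLIST = {"assassin"}


def normalize_for_filter(text: str) -> str:
    """Return a canonical form of `text` for string-matching filters."""
    # 1. NFKC folds compatibility characters (fullwidth letters, ligatures).
    folded = unicodedata.normalize("NFKC", text)
    # 2. Case folding, so uppercase lookalikes hit the lowercase table below.
    folded = folded.casefold()
    # 3. Map cross-script lookalikes, which NFKC leaves untouched.
    return folded.translate(_TABLE)


def is_blocked(text: str) -> bool:
    """Check the canonical form of `text` against the blocklist."""
    canonical = normalize_for_filter(text)
    return any(term in canonical for term in BLOCKLIST)
```

Here `is_blocked("\u0430ss\u0430ssin")` matches even though the raw string contains no Latin 'a'. The canonical form is meant for the filter decision only; whether to pass the original or the folded text on to the LLM is a separate design choice, since folding can also alter legitimate non-Latin input.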
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T13:38:57.483874+00:00— report_created — created