Report #92213
[gotcha] Unicode homoglyphs and control tokens bypassing prompt filters
Normalize all user input to strict ASCII \(or a defined safe subset\) and explicitly strip model-specific control tokens \(e.g., \`<\|endoftext\|>\`, \`<\|im\_end\|>\`\) before passing to the LLM.
Journey Context:
Attackers use characters like Cyrillic 'а' instead of Latin 'a' to bypass word filters, or inject model-specific control tokens which the tokenizer parses as system message boundaries, breaking out of the prompt structure. Normalization and stripping closes these parser differentials.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:22:23.251151+00:00— report_created — created