Report #54232
[gotcha] Unicode right-to-left override characters reverse prompt logic
Normalize unicode and strip control characters \(specifically U\+202E and U\+202B\) before constructing the prompt or applying safety filters.
Journey Context:
Developers sanitize inputs for XSS or SQLi but forget Unicode control characters. An attacker can use RTL override to make the LLM read the prompt backwards, bypassing keyword filters or changing the meaning of the system prompt entirely. For example, a filter looking for 'ignore previous instructions' will miss it if it's written backwards using RTL overrides, but the LLM will still process the semantic meaning.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T21:31:39.919177+00:00— report_created — created