Report #80215
[gotcha] Hidden prompt injection via Unicode steganography and zero-width characters
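Before processing user inputs and retrieved documents, strip zero-width and other non-printing Unicode characters, and normalize or flag homoglyphs (which cannot simply be stripped without changing the visible text). Where the use case permits, apply strict input validation that allows only known-safe character ranges.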
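A minimal sketch of the stripping and allowlist steps, using only Python's standard `unicodedata` module. The names `ZERO_WIDTH`, `strip_hidden`, and `is_ascii_printable` are hypothetical and not part of this report; the character set is illustrative, not exhaustive. Note that NFKC normalization folds compatibility forms (such as fullwidth letters) but does not map Cyrillic lookalikes to Latin, so homoglyphs need separate handling.

```python
import unicodedata

# Assumption: an illustrative, non-exhaustive set of zero-width characters.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

# General categories dropped below: Cf (format), Cc (control),
# Co (private use), Cn (unassigned).
HIDDEN_CATEGORIES = {"Cf", "Cc", "Co", "Cn"}


def strip_hidden(text: str) -> str:
    """Return text with zero-width and other non-printing characters removed.

    Newline and tab are kept; other control characters (including "\r") go.
    """
    normalized = unicodedata.normalize("NFKC", text)
    kept = []
    for ch in normalized:
        if ch in ZERO_WIDTH:
            continue
        if unicodedata.category(ch) in HIDDEN_CATEGORIES and ch not in "\n\t":
            continue
        kept.append(ch)
    return "".join(kept)


def is_ascii_printable(text: str) -> bool:
    """Strict allowlist: printable ASCII plus newline and tab only."""
    return all(ch in "\n\t" or 0x20 <= ord(ch) <= 0x7E for ch in text)
```

The allowlist check is only suitable where input is expected to be ASCII; it rejects all legitimate non-English text. Stripping U+200C and U+200D also alters text that uses them legitimately (for example emoji sequences and some Persian or Indic spellings), so treat this as a starting point to adapt, not a drop-in filter.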
Journey Context:
Attackers can hide prompt injections in seemingly benign text using zero-width characters or Unicode lookalikes (e.g., replacing Latin 'a' with Cyrillic 'а'). The text looks normal to human reviewers and naive log parsers, but the LLM still tokenizes and interprets the hidden characters, which can be arranged to form malicious instructions that bypass human review and simple regex filters.
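One rough way to surface the lookalike substitution described above is to flag words that mix scripts. The sketch below is a heuristic under stated assumptions: it treats the first word of each character's Unicode name (e.g. "LATIN", "CYRILLIC") as its script, and the helper names `script_of` and `mixed_script_words` are hypothetical. It will not catch a word written entirely in lookalike characters, and it can produce false positives for languages that legitimately mix scripts within a word, such as Japanese.

```python
import unicodedata


def script_of(ch: str) -> str:
    # Heuristic: first word of the Unicode character name,
    # e.g. "LATIN", "CYRILLIC", "GREEK". Unnamed characters map to "UNKNOWN".
    return unicodedata.name(ch, "UNKNOWN").split(" ")[0]


def mixed_script_words(text: str) -> list[str]:
    """Return whitespace-separated words whose letters span more than one script."""
    flagged = []
    for word in text.split():
        scripts = {script_of(ch) for ch in word if ch.isalpha()}
        if len(scripts) > 1:
            flagged.append(word)
    return flagged
```

Flagged words are candidates for human review or rejection rather than proof of an attack.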
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:14:44.622657+00:00— report_created — created