Report #45595
[gotcha] Prompt injection attacks bypassing input filters using invisible or homoglyph characters
Normalize and strip Unicode characters \(especially zero-width spaces, soft hyphens, and homoglyphs\) from all user inputs before processing or embedding, and use token-aware input validation.
Journey Context:
Attackers hide malicious payloads in plain sight using characters that render invisibly to humans and simple regex filters, but are parsed by the LLM's tokenizer as valid instruction tokens. Filters that only look at the visible string miss the hidden payload, allowing the injection to execute silently.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T07:00:28.745879+00:00— report_created — created