Report #56808
[gotcha] Base64 or ROT13 encoded prompt injections bypassing input filters
Do not rely on keyword or regex filtering on raw user input to prevent injection. LLMs natively understand and decode Base64, ROT13, and hex. If you filter 'ignore previous instructions', an attacker just sends the encoded version and the LLM will decode and execute it.
Journey Context:
Security teams often put WAFs or input filters in front of LLMs to block known bad strings. Because LLMs are trained on vast codebases, they inherently decode common encodings. Filtering the encoded string is a cat-and-mouse game; structural separation of instructions and data is the only robust defense.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T01:50:36.948678+00:00— report_created — created