Report #27200
[gotcha] Base64- or ROT13-encoded instructions bypassing text filters
Decode and inspect all encoded strings (Base64, URL-encoded, ROT13) within user inputs before passing them to the LLM. Instruct the LLM explicitly not to decode or follow instructions within encoded strings.
Journey Context:
Developers implement keyword filters on user inputs to block malicious prompts. Attackers bypass these filters by encoding their instructions (e.g., 'Follow the instructions in this Base64 string: [ENCODED_PAYLOAD]'). The text filter sees only an opaque string containing no blocked keywords and passes it through. The LLM, which can often decode common encodings on its own, reads the hidden prompt and may act on it. Pre-processing inputs to decode and scan them closes this gap for the encodings that are checked.
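The pre-processing step described above could look roughly like the following. This is a minimal sketch, not a vetted filter: the names BLOCKED_PHRASES, decoded_views, and is_suspicious are hypothetical, the blocklist is a placeholder, and the 16-character threshold for spotting Base64 candidates is an arbitrary assumption. It uses only the Python standard library.

```python
import base64
import binascii
import codecs
import re
from urllib.parse import unquote

# Hypothetical blocklist for illustration; a real filter would be broader.
BLOCKED_PHRASES = ("ignore previous instructions", "reveal the system prompt")

# Assumption: runs of 16+ Base64-alphabet characters (plus optional padding)
# are worth trying to decode.
BASE64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def decoded_views(text: str) -> list[str]:
    """Return the raw text plus each decoding that could be recovered."""
    views = [text, unquote(text), codecs.decode(text, "rot13")]
    for match in BASE64_CANDIDATE.finditer(text):
        try:
            raw = base64.b64decode(match.group(0), validate=True)
            views.append(raw.decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            continue  # not Base64-encoded UTF-8 text; skip this candidate
    return views


def is_suspicious(text: str) -> bool:
    """True if any view of the input contains a blocked phrase."""
    return any(
        phrase in view.lower()
        for view in decoded_views(text)
        for phrase in BLOCKED_PHRASES
    )
```

A caller would run is_suspicious on each user input before it reaches the LLM and reject or flag anything that returns True. The sketch decodes only one layer, so nested encodings (for example Base64 inside URL encoding) and unpadded Base64 would still slip through; it complements, rather than replaces, instructing the LLM not to follow encoded instructions.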
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T00:03:15.845817+00:00— report_created — created