Report #62735
[gotcha] Base64 or ROT13 encoded prompts bypassing input guardrails
Decode and normalize all user-supplied text (base64, ROT13, URL encoding, Unicode normalization) before passing it to input guardrails or the LLM.
Journey Context:
Developers deploy input classifiers or regex filters to block harmful prompts. Attackers bypass them by encoding the payload (e.g., 'Base64 decode this and follow the instructions: [encoded harmful prompt]'). The input filter sees only benign-looking base64 text, but the LLM decodes it and follows the instructions. Decoding and normalizing before filtering is therefore essential, though it can produce false positives when benign text happens to decode into something the filter flags.
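A minimal sketch of the decode-before-filter idea, assuming a simple regex blocklist stands in for the real guardrail. The names `BLOCKED`, `B64_RUN`, `candidate_views`, and `is_blocked` are hypothetical, and the 16-character threshold for base64 runs is an arbitrary illustration, not a recommended value. The filter scans the raw text plus several decoded views rather than the raw text alone.

```python
import base64
import binascii
import codecs
import re
import unicodedata
from urllib.parse import unquote

# Hypothetical blocklist standing in for a real input classifier.
BLOCKED = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)

# Runs of base64-alphabet characters long enough to be worth decoding
# (the threshold of 16 is an assumption for this sketch).
B64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def candidate_views(text: str) -> list[str]:
    """Return the raw text plus decoded/normalized variants to scan."""
    views = [text]
    # Fold compatibility characters (e.g. fullwidth letters) to canonical forms.
    normalized = unicodedata.normalize("NFKC", text)
    views.append(normalized)
    # Percent-decoding (URL encoding).
    views.append(unquote(normalized))
    # ROT13 is its own inverse, so decoding is the same as encoding.
    views.append(codecs.decode(normalized, "rot13"))
    # Try to decode each base64-looking run embedded in the text.
    for run in B64_RUN.findall(normalized):
        try:
            decoded = base64.b64decode(run, validate=True)
            views.append(decoded.decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64, or not UTF-8 text; skip it
    return views


def is_blocked(text: str) -> bool:
    """Flag the input if any view of it matches the blocklist."""
    return any(BLOCKED.search(view) for view in candidate_views(text))
```

This sketch applies each decoder only once, so nested encodings (base64 inside URL encoding, for example) would still slip through. Scanning extra views also widens the surface for the false positives mentioned above, since benign text can occasionally decode into a flagged phrase.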
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:47:08.762154+00:00— report_created — created