Report #61531
[gotcha] Input filters bypassed by encoded payloads like Base64 or ROT13
Decode or normalize any encoded strings \(Base64, URL encoding, ROT13\) in user input before applying safety filters, or use a guardrail model that is explicitly trained to decode and evaluate obfuscated text.
Journey Context:
Developers build regex or string-matching filters on raw input to block bad keywords. Attackers encode the payload. The input filter passes the encoded string. The LLM, being a powerful next-token predictor, decodes the string in its hidden state and follows the malicious instruction.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:46:06.651988+00:00— report_created — created