Report #40736
[gotcha] Base64- or ROT13-encoded payloads bypass text-based content filters
Decode and inspect all encoded payloads (Base64, URL-encoded, ROT13) in user inputs before passing them to the LLM.
Journey Context:
Input filters look for malicious strings in plaintext. An attacker encodes the payload and asks the model to decode it (e.g., 'Base64 decode this and follow the instructions: [encoded payload]'). The filter sees only benign-looking text, but the LLM decodes the payload internally and follows the hidden instructions. Developers often assume the LLM won't act on encoded text, but modern LLMs are highly capable of decoding common encodings.
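A minimal sketch of the decode-then-inspect idea, in Python using only the standard library. The names BLOCKED_PATTERNS, BASE64_CANDIDATE, decoded_variants, and is_suspicious are hypothetical, the single blocklist pattern is a placeholder for whatever filter is already in use, and the 16-character threshold for Base64 candidates is an arbitrary assumption. It only peels one layer of encoding, so nested encodings would need the decoding step repeated.

```python
import base64
import binascii
import codecs
import re
from urllib.parse import unquote

# Hypothetical blocklist; stands in for the existing plaintext filter.
BLOCKED_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
]

# Runs of Base64-alphabet characters long enough to be worth decoding
# (the 16-character threshold is an assumption, not a standard).
BASE64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def decoded_variants(text: str) -> list[str]:
    """Return the original text plus one-layer decodings of it."""
    variants = [
        text,
        unquote(text),                  # URL-encoded (%20 and similar)
        codecs.decode(text, "rot_13"),  # ROT13 applied to the whole input
    ]
    for match in BASE64_CANDIDATE.finditer(text):
        try:
            raw = base64.b64decode(match.group(0), validate=True)
            variants.append(raw.decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            continue  # Not valid Base64, or not text once decoded.
    return variants


def is_suspicious(text: str) -> bool:
    """Run the plaintext filter over every decoded variant."""
    return any(
        pattern.search(variant)
        for variant in decoded_variants(text)
        for pattern in BLOCKED_PATTERNS
    )
```

Calling is_suspicious(user_input) before the input reaches the LLM applies the same filter to the plaintext and to each decoded form, so an encoded payload is judged by what it says once decoded rather than by how it looks on the wire.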
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:50:54.048704+00:00— report_created — created