Report #61518
[gotcha] LLM decodes and executes base64- or ROT13-encoded malicious instructions
Scan LLM inputs and outputs for encoded payloads (base64, hex, ROT13) and decode them before applying safety filters, or instruct the LLM explicitly not to decode or execute encoded text.
Journey Context:
Safety filters often operate on plain text. Attackers hide instructions in base64 or other encodings within a prompt or a retrieved document. The filter sees only an opaque string and lets it through, but the LLM, being a powerful pattern matcher, can often decode the text internally and follow the hidden instructions, bypassing the plaintext safety filter entirely.
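The decode-before-filtering mitigation from the summary could look roughly like the sketch below. It is illustrative only: `BLOCKED_PHRASES` is a hypothetical stand-in for whatever plaintext safety filter is already in place, the function names are made up for this example, and the 16-character thresholds in the regexes are assumptions chosen to limit false positives, not tuned values.

```python
import base64
import binascii
import codecs
import re
from typing import List, Optional

# Hypothetical blocklist standing in for a real plaintext safety filter.
BLOCKED_PHRASES = ("ignore previous instructions", "reveal the system prompt")

# Candidate runs: 16+ base64 characters, or 8+ hex byte pairs (assumed thresholds).
BASE64_RUN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")
HEX_RUN = re.compile(r"\b(?:[0-9a-fA-F]{2}){8,}\b")


def _printable(data: bytes) -> Optional[str]:
    """Return data as text if it is valid, printable UTF-8; otherwise None."""
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        return None
    if text and all(c.isprintable() or c.isspace() for c in text):
        return text
    return None


def decoded_views(text: str) -> List[str]:
    """Return the original text plus every plausible decoding found in it."""
    # ROT13 is applied to the whole text; base64 and hex only to matching runs.
    views = [text, codecs.decode(text, "rot13")]
    for run in BASE64_RUN.findall(text):
        try:
            decoded = _printable(base64.b64decode(run, validate=True))
        except (binascii.Error, ValueError):
            continue  # not valid base64 (e.g. wrong padding); skip this run
        if decoded:
            views.append(decoded)
    for run in HEX_RUN.findall(text):
        decoded = _printable(bytes.fromhex(run))
        if decoded:
            views.append(decoded)
    return views


def is_blocked(text: str) -> bool:
    """Apply the plaintext filter to the raw text and to each decoded view."""
    return any(
        phrase in view.lower()
        for view in decoded_views(text)
        for phrase in BLOCKED_PHRASES
    )
```

The same check can be run on retrieved documents and on model output before either is passed along. This sketch decodes only one layer, so nested or chunked encodings would still slip past it; pairing it with an explicit instruction to the LLM not to decode or execute encoded text covers some of that gap.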
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:44:53.445946+00:00— report_created — created