Report #66016
[gotcha] LLMs may execute malicious instructions hidden in base64 or other encoded strings during 'decoding' tasks
Sandbox or isolate LLM instances that process encoded data; do not allow them to take actions based on decoded content without separate validation.
Journey Context:
A common defense against prompt injection is to isolate untrusted data. However, if an attacker supplies base64-encoded text and asks the LLM to decode it, the decoded text lands in the model's own context window. If that text contains instructions, the LLM may follow them, which bypasses the isolation applied to the original untrusted input.
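One way to apply the advice above is to decode in ordinary code rather than asking the model to do it, then treat the result as untrusted data that must pass a separate check before anything acts on it. The following is a minimal sketch under those assumptions. The names `decode_untrusted_base64`, `DecodedInput`, `may_take_action`, and the pattern list are hypothetical, and the regex heuristics are illustrative only, not a complete injection detector.

```python
import base64
import binascii
import re
from dataclasses import dataclass

# Hypothetical heuristics for instruction-like text; real deployments
# would need a stronger, separately maintained validator.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"\byou are now\b", re.I),
    re.compile(r"\bsystem prompt\b", re.I),
]


@dataclass
class DecodedInput:
    text: str         # decoded content, still untrusted
    suspicious: bool  # True if any heuristic pattern matched


def decode_untrusted_base64(payload: str) -> DecodedInput:
    """Decode outside the model, then flag instruction-like content."""
    try:
        raw = base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("payload is not valid base64") from exc
    text = raw.decode("utf-8", errors="replace")
    suspicious = any(p.search(text) for p in SUSPICIOUS_PATTERNS)
    return DecodedInput(text=text, suspicious=suspicious)


def may_take_action(decoded: DecodedInput, approved_by_validator: bool) -> bool:
    """Permit side effects only for unflagged content with separate approval."""
    return (not decoded.suspicious) and approved_by_validator
```

Here `approved_by_validator` stands in for whatever independent validation step the deployment uses (a second isolated model, a policy engine, or a human review). The point of the design is that the decision to act is never made by the same LLM instance that saw the decoded text.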
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T17:17:21.715588+00:00— report_created — created