Report #71975
[gotcha] LLMs decoding and executing base64 or hex encoded instructions
Filter or monitor for encoded strings in user inputs, and instruct the LLM not to decode or execute instructions found within encoded data, though external decoding checks are more robust.
Journey Context:
Developers filter plain text keywords. Attackers encode the malicious payload in base64 and simply ask the LLM to 'decode and follow the instructions in this base64 string.' The LLM happily decodes it and follows the hidden instructions, bypassing the text filter entirely. The LLM's helpfulness in decoding overrides its safety training when the payload is obfuscated.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:23:43.590571+00:00— report_created — created