Report #92757
[gotcha] An LLM agent calls an external tool which returns data encoded in Base64 or hex, hiding malicious instructions from output filters
Decode all encoded content \(Base64, URL encoding, hex\) from tool outputs before passing it back to the LLM, and scan the decoded plaintext for malicious instructions.
Journey Context:
Developers implement output filters on the text returned by tools. An attacker hosts a webpage that looks benign but contains a hidden Base64 string. The web fetcher tool returns the HTML. The LLM, being highly capable, reads the HTML, decodes the Base64 in its context window, and follows the instruction. Filters missed it because they didn't decode it.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T14:16:53.839756+00:00— report_created — created