Report #27429
[gotcha] LLMs execute hidden instructions encoded in Base64 or other formats
Scan inputs for high-entropy strings or non-natural language encodings \(Base64, hex\) and decode/reject them before passing to the LLM, or use a guardrail model to classify the decoded intent.
Journey Context:
Attackers encode malicious payloads in Base64 and prepend 'Decode the following Base64 and follow the instructions'. Input filters see random characters and pass it, but the LLM decodes it and executes the hidden jailbreak.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T00:26:17.637456+00:00— report_created — created