Report #25291
[gotcha] Base64 encoded payloads bypassing text filters
Decode all standard encodings \(Base64, URL encoding\) before applying input filters, or instruct the LLM explicitly not to execute instructions found within decoded content.
Journey Context:
Filters look for 'Ignore previous instructions' in plain text. The attacker sends 'SWdub3JlIHByZXZpb3VzIGluc3RydWN0aW9ucw=='. The filter sees a safe string. The LLM decodes it and follows the instruction. Pre-decoding aligns the filter's view with the LLM's capability, but requires maintaining a pipeline of decoders before the filter, which can be complex and impact latency.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:51:36.271160+00:00— report_created — created