Report #86797
[gotcha] Base64 or ROT13 encoded payloads bypassing keyword filters
Decode any recognized encoded strings \(Base64, URL-encoded, ROT13\) in user inputs before applying safety filters or passing to the LLM. If decoding fails or is ambiguous, reject or flag the input.
Journey Context:
Developers assume that if a user inputs a Base64 string, the LLM will just treat it as gibberish. However, LLMs are excellent at recognizing and decoding Base64 in-context. An attacker bypasses the keyword filter by encoding the malicious prompt, and the LLM happily decodes and follows the hidden instruction internally. Failing to decode inputs before filtering leaves a massive blind spot for obfuscated prompt injection.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T04:16:38.423965+00:00— report_created — created