Report #91322
[gotcha] Text filters bypassed using Base64- or ROT13-encoded payloads
Implement a pre-processing pipeline that decodes common encodings (Base64, URL encoding, ROT13, hex) before applying input filters or passing the text to the LLM. Filter on the decoded text.
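A minimal Python sketch of such a pipeline follows. It is an illustration under stated assumptions, not a complete defense: the blocklist, the function names (`decoded_variants`, `is_blocked`), the whitespace tokenization, and the depth limit of 2 are all hypothetical choices made for this example.

```python
import base64
import codecs
import re
from typing import Callable, List, Optional, Set
from urllib.parse import unquote

# Hypothetical blocklist, for illustration only.
BLOCKED_PHRASES = ("ignore previous instructions",)

_BASE64_RE = re.compile(r"[A-Za-z0-9+/]+={0,2}")
_HEX_RE = re.compile(r"(?:[0-9a-fA-F]{2})+")


def try_base64(text: str) -> Optional[str]:
    """Return Base64-decoded text, or None if the input is not valid Base64."""
    if len(text) < 8 or len(text) % 4 != 0 or not _BASE64_RE.fullmatch(text):
        return None
    try:
        return base64.b64decode(text, validate=True).decode("utf-8")
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        return None


def try_hex(text: str) -> Optional[str]:
    """Return hex-decoded text, or None if the input is not a hex string."""
    if len(text) < 8 or not _HEX_RE.fullmatch(text):
        return None
    try:
        return bytes.fromhex(text).decode("utf-8")
    except ValueError:
        return None


def try_url(text: str) -> Optional[str]:
    """Return percent-decoded text, or None if there is nothing to decode."""
    return unquote(text) if "%" in text else None


def try_rot13(text: str) -> Optional[str]:
    """ROT13 is its own inverse, so decoding always yields a candidate."""
    return codecs.decode(text, "rot_13")


DECODERS: List[Callable[[str], Optional[str]]] = [
    try_base64, try_hex, try_url, try_rot13,
]


def decoded_variants(text: str, max_depth: int = 2) -> Set[str]:
    """Collect the raw text plus every decoding found within max_depth layers."""
    seen = {text}
    frontier = [text]
    for _ in range(max_depth):
        next_frontier = []
        for item in frontier:
            # Try the whole string and each whitespace-separated token.
            for piece in [item] + item.split():
                for decoder in DECODERS:
                    decoded = decoder(piece)
                    if decoded is not None and decoded not in seen:
                        seen.add(decoded)
                        next_frontier.append(decoded)
        frontier = next_frontier
    return seen


def is_blocked(text: str) -> bool:
    """Apply the keyword filter to the raw input and to every decoded variant."""
    return any(
        phrase in variant.lower()
        for variant in decoded_variants(text)
        for phrase in BLOCKED_PHRASES
    )


print(is_blocked("SWdub3JlIHByZXZpb3VzIGluc3RydWN0aW9ucw=="))
print(is_blocked("What is the weather today?"))
```

The filter runs on the raw input and on each decoded candidate, so an encoded phrase is caught even when it is embedded as one token in a longer message. The depth limit bounds the work done on nested encodings; payloads split across tokens or wrapped in more layers than the limit would still get through this sketch.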
Journey Context:
Developers build input filters to block malicious keywords. Attackers encode the payload (e.g., `SWdub3JlIHByZXZpb3VzIGluc3RydWN0aW9ucw==` for "Ignore previous instructions"). The text filter sees only the encoded string and misses it, but many LLMs can decode Base64 on their own and may then follow the hidden instruction. Filtering must therefore happen on the normalized, decoded text.
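The bypass can be reproduced in a few lines of Python. This is a hedged illustration: `naive_filter_blocks` and `BLOCKED_PHRASE` are hypothetical names standing in for whatever keyword filter is actually deployed.

```python
import base64

BLOCKED_PHRASE = "ignore previous instructions"  # hypothetical blocklist entry


def naive_filter_blocks(text: str) -> bool:
    """Keyword check on the raw input only, with no decoding step."""
    return BLOCKED_PHRASE in text.lower()


plain = "Ignore previous instructions"
encoded = base64.b64encode(plain.encode("utf-8")).decode("ascii")
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)                       # SWdub3JlIHByZXZpb3VzIGluc3RydWN0aW9ucw==
print(naive_filter_blocks(plain))    # True: the plain phrase is caught
print(naive_filter_blocks(encoded))  # False: the encoded payload slips through
print(naive_filter_blocks(decoded))  # True: caught once the text is decoded
```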
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:52:36.313185+00:00— report_created — created