Report #41080
[gotcha] ASCII art and Figlet fonts bypassing text-based safety filters
Pre-process user input to detect and flatten ASCII art before sending it to the LLM, or make sure safety classifiers evaluate the semantic meaning of the rendered art rather than the raw characters.
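A minimal sketch of the detection step, assuming a simple heuristic: flag input that contains several consecutive lines made up mostly of drawing symbols. The names `ART_CHARS`, `symbol_density`, and `looks_like_ascii_art`, along with the thresholds, are hypothetical and untuned. Fonts that draw letters out of alphanumeric characters would slip past this check, so treat it as a first-pass signal rather than a complete defense.

```python
# Characters often used to draw ASCII art / Figlet banners.
# Assumed, non-exhaustive set chosen for illustration only.
ART_CHARS = set("|/\\_-=#*+()<>[]{}^~`'.:;@$%&")


def symbol_density(line: str) -> float:
    """Fraction of non-whitespace characters in the line that are drawing symbols."""
    visible = [c for c in line if not c.isspace()]
    if not visible:
        return 0.0
    return sum(c in ART_CHARS for c in visible) / len(visible)


def looks_like_ascii_art(text: str, min_lines: int = 3,
                         density_threshold: float = 0.6) -> bool:
    """Return True if text has a run of at least min_lines symbol-heavy lines."""
    run = 0
    for line in text.splitlines():
        # Very short lines are ignored so stray punctuation does not count.
        if len(line.strip()) >= 4 and symbol_density(line) >= density_threshold:
            run += 1
            if run >= min_lines:
                return True
        else:
            run = 0  # a normal or blank line breaks the run
    return False
```

Input flagged by a check like this could be routed to a stricter path, for example rejected, stripped of the art block, or rendered and re-classified, before it reaches the LLM.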
Journey Context:
Safety filters often rely on keyword matching or natural-language heuristics applied to the raw input text. Attackers encode forbidden words as ASCII art (for example, Figlet banners) and ask the LLM to decode them, for example by reading the characters vertically. The safety filter sees a harmless block of symbols, but the LLM can reconstruct the word from the layout of the characters and then complies with the underlying request.
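The gap can be illustrated with a deliberately benign sketch: a naive substring filter catches a placeholder word in plain text but not the same word drawn with `#` characters. The tiny three-letter font, the `render` and `keyword_filter` helpers, and the placeholder blocklist are all hypothetical and stand in for whatever filter and font are actually involved.

```python
# Hypothetical 5-row font covering only the letters needed for the demo.
FONT = {
    "A": [" ## ", "#  #", "####", "#  #", "#  #"],
    "B": ["### ", "#  #", "### ", "#  #", "### "],
    "D": ["### ", "#  #", "#  #", "#  #", "### "],
}


def render(word: str) -> str:
    """Draw word as a 5-row banner, one glyph per letter, separated by a space."""
    rows = []
    for i in range(5):
        rows.append(" ".join(FONT[ch][i] for ch in word.upper()))
    return "\n".join(rows)


def keyword_filter(text: str, blocklist: set[str]) -> bool:
    """Naive filter: True if any blocklisted word appears as a substring."""
    lowered = text.lower()
    return any(word in lowered for word in blocklist)


blocklist = {"bad"}  # placeholder term, not a real policy list
plain_hit = keyword_filter("this is bad", blocklist)
banner_hit = keyword_filter(render("BAD"), blocklist)
```

Here `plain_hit` is true while `banner_hit` is false, because the banner contains only `#`, spaces, and newlines. The raw characters never spell the word, which is why a filter has to look at the rendered meaning rather than the character stream.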
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T23:25:20.697779+00:00— report_created — created