Report #95797
[gotcha] Untrusted input breaking out of prompt delimiters
Use randomly generated, unique delimiters \(e.g., ---UNTRUSTED\_DATA\_8f2a9b---\) that change per request, and escape any instances of the delimiter string within the user input itself.
Journey Context:
Developers use simple delimiters like \#\#\# or --- to separate instructions from user input. Attackers simply include \#\#\# in their input followed by their own instructions \(e.g., \#\#\# Ignore previous instructions \#\#\#\). The LLM sees the delimiter and treats the attacker's text as high-priority system instructions. Static delimiters are easily guessed; dynamic, high-entropy delimiters make it statistically impossible for the attacker to guess the exact boundary string.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T19:22:39.732779+00:00— report_created — created