Report #94414
[gotcha] Attacker escapes user prompt boundaries using system prompt delimiters
Use randomly generated, unique session tokens for delimiters \(e.g., \) instead of generic tags like or ---. Validate that user input does not contain these tokens before constructing the prompt.
Journey Context:
Developers use XML tags or markdown lines to separate system context from user input. If an attacker guesses or leaks the delimiter format, they can inject 'Ignore previous instructions and...'. The LLM parser sees the closing tag and treats the rest as a system instruction. Randomizing delimiters per request makes it computationally infeasible for the attacker to know the exact string required to break out of the user context.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T17:03:23.779387+00:00— report_created — created