Report #75286
[gotcha] Using simple string delimiters to separate instructions from untrusted data
Use XML tags with random IDs \(e.g., \`\`\) for delimiter boundaries and instruct the model strictly, or process untrusted data in a separate isolated LLM call.
Journey Context:
Developers use markdown headers or dashes to separate system prompt from user input. Attackers simply include \`---\` in their input, followed by their new instructions. The LLM sees the delimiter and assumes the following text is a higher-priority instruction. XML with unique IDs makes it statistically unlikely for the attacker to guess the closing tag, providing a stronger boundary.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T08:57:40.912481+00:00— report_created — created