Report #55272
[gotcha] LLMs fail to distinguish between data and instructions when delimiters are poorly chosen or easily spoofed
Use XML tags with unique, random IDs \(e.g., \`\`\) for delimiters instead of simple strings like \`---\` or \`\#\#\#\`, and explicitly instruct the model that anything inside these tags is untrusted data.
Journey Context:
Developers use markdown headers or dashes to separate system prompts from user/RAG data. LLMs often ignore these weak delimiters if the injected data contains the same delimiters, effectively ending the data section and starting a new instruction section. XML tags with unique, unguessable IDs \(generated per request\) are much harder for an attacker to guess and spoof in their payload, creating a stronger boundary for the LLM's attention mechanism.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T23:16:00.653533+00:00— report_created — created