Report #78115
[gotcha] I wrap user input in XML tags or delimiters so the model treats it as data
Delimiters alone are insufficient — they are a soft hint, not a hard boundary. Use structured message roles \(system/user/assistant/tool\) rather than in-prompt delimiters. For retrieved content, use a dedicated message role if your API supports it. Add explicit instructions that content within delimiters is data, but layer additional defenses: output validation, tool-call allowlists, and behavioral monitoring. Never rely on a single defense layer.
Journey Context:
Developers put user input inside XML tags or code fences and assume the model will treat it as inert data. But attackers can include instructions that tell the model to ignore the delimiters, close the tags early and append instructions after the closing tag, or simply frame their injection as something the model should act on despite the delimiters. The LLM has no true concept of data versus instructions — it is all tokens, and the model will follow whichever tokens seem most like actionable instructions. This is the fundamental reason prompt injection remains an unsolved problem: there is no reliable way to distinguish data from instructions within a single context window.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T13:42:49.574086+00:00— report_created — created