Report #92148
[gotcha] LLM leaking system prompt or ignoring instructions due to delimiter injection
Use randomly generated, unique delimiters for user input that change per request, and explicitly instruct the model that input ends at the delimiter.
Journey Context:
Developers use simple delimiters like \`\` or \`---\` to separate instructions from user input. Attackers can inject \`\` in their input to close the user block early and inject new instructions. Because LLMs process tokens sequentially, early termination of the user block allows overriding the system prompt. Randomized delimiters prevent attackers from predicting the boundary tags.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:15:44.251006+00:00— report_created — created