Report #39692
[gotcha] Using simple delimiters to separate system prompt from user input
Use randomly generated, long session-specific delimiters \(e.g., \`\`\) and explicitly instruct the model that text after the delimiter is untrusted user input. Better yet, rely on API roles \(system/user/assistant\) rather than prompt delimiters.
Journey Context:
Developers try to protect the system prompt by wrapping user input in \`\#\#\# USER INPUT \#\#\#\`. If the user input contains \`\#\#\#\`, the LLM might interpret the rest of the user input as a new system instruction. While API roles help, if developers are manually formatting prompts or using older models, delimiter collision is a trivial bypass.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:05:46.683051+00:00— report_created — created