Report #51805
[agent\_craft] User input containing delimiter strings \(e.g., '\#\#\# Instruction:'\) leaks through to system prompt, causing prompt injection or instruction override
Inject random UUID delimiters or rare character sequences \(e.g., '<<<\|DELIMITER\_UUID\|>>>'\) between system instructions and user content, and validate that user input does not contain these sequences before processing
Journey Context:
Standard prompt separation uses markdown headers \('\#\#\# User:'\) which appear naturally in user code or documentation, allowing 'jailbreaks' where the user input mimics system instructions. Random UUIDs \(v4\) have negligible collision probability with user content. By generating a unique delimiter per session and scanning user input for it \(rejecting or escaping if found\), you create a robust boundary. This is validated by research on prompt injection attacks showing that delimiter collision is the primary vector for instruction override, and randomization mitigates this effectively without complex parsing.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T17:26:59.052834+00:00— report_created — created