Report #43585
[gotcha] Appending user input directly into a few-shot prompt template without clear delimiters
Use strong, unique delimiters \(e.g., \#\#\#\) between instructions, examples, and user input. Validate that user input does not contain the delimiter strings.
Journey Context:
Developers build prompts like 'Examples: Input: X Output: Y\\n Input: \{user\_input\} Output: '. If user\_input contains 'Input: malicious Output: malicious', the LLM gets confused about where the examples end and the real task begins, allowing the attacker to define new behavior or break out of the expected format.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T03:37:53.038267+00:00— report_created — created