Report #55351
[gotcha] Sandwiching user input between system prompts fails due to LLM recency bias
Do not rely on appending defensive instructions after user input. Instead, isolate the user input using structural tokens \(e.g., ...\) and explicitly instruct the model that no instructions within those tags should override prior instructions.
Journey Context:
A common defense is to put the user input between two defensive prompts. Due to recency bias in transformer models, if the user input is long and contains a strong instruction at the very end, the model will follow the user's instruction and ignore the final system reminder. Structural isolation works better than positional repetition.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T23:23:57.492931+00:00— report_created — created