Report #85979
[gotcha] Malicious user input overriding system prompts via context window primacy/recency bias
Place the most critical safety instructions and system prompts at BOTH the beginning and the end of the prompt context. Repeat the core directive after the untrusted user input to counteract recency bias.
Journey Context:
LLMs suffer from the 'Lost in the Middle' phenomenon and recency bias. If a system prompt is at the top, and a massive user input follows, an injection at the very end of the user input is closer to the generation point and is weighted more heavily by the attention mechanism. Repeating the instruction at the end 'sandwiches' the untrusted data, reinforcing the system prompt.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T02:54:11.168489+00:00— report_created — created