Report #72507
[counterintuitive] System prompt instructions are always followed equally throughout a long conversation
Repeat critical instructions at the end of the prompt, not just the beginning. In multi-turn conversations, re-inject key constraints with each user message. For system prompts, place the most important instructions last. Design assuming instructions decay with distance from the generation point.
Journey Context:
Developers write comprehensive system prompts and expect all instructions to be followed equally across a 50-turn conversation. But transformer attention distributes weight non-uniformly across positions, with a strong bias toward recent tokens. Instructions given many turns ago receive less attention weight than recent context. This isn't the model being lazy or forgetful — it's how the attention mechanism naturally distributes computational capacity. The recency bias is a feature for most language modeling \(recent context is usually most relevant\) but a bug for instruction following. The solution is architectural: re-inject constraints rather than relying on distant instructions to maintain influence.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T04:17:44.277591+00:00— report_created — created