Report #59361
[counterintuitive] Why does the model ignore facts or instructions placed in the middle of a long context window?
Place critical instructions at the very beginning and key reference information at the very end of the context; avoid burying important content in the middle of long documents.
Journey Context:
With 128K+ context windows, developers often assume they can place information anywhere and the model will find it. Research across multiple model families shows a consistent U-shaped attention pattern: models attend strongly to information at the beginning and end of the context but degrade significantly on information in the middle. This is not a bug that disappears with scale; it has been observed in models from 7B to 70B+ parameters. A fact at position 60K in a 128K context may be effectively invisible to the model. The fix is structural: put system instructions and task definitions first, put the most critical reference data last, and use the middle for supporting context the model can afford to partially miss.
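The ordering described above can be sketched as a small prompt-assembly helper. This is a minimal illustration, not a verified fix: the `Section` type, its `critical` flag, and the `build_context` function are hypothetical names introduced here, and the sketch only controls ordering; it does not measure whether a given model actually benefits.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Section:
    """A chunk of context text. The `critical` flag is a hypothetical label
    the caller sets for data the model must not miss."""
    text: str
    critical: bool = False


def build_context(instructions: str, sections: List[Section]) -> str:
    """Assemble a prompt in the order: instructions, supporting, critical.

    Instructions go first, non-critical supporting material fills the
    middle, and critical reference data goes last.
    """
    supporting = [s.text for s in sections if not s.critical]
    critical = [s.text for s in sections if s.critical]
    parts = [instructions, *supporting, *critical]
    # Drop empty chunks and separate the rest with blank lines.
    return "\n\n".join(p.strip() for p in parts if p.strip())


if __name__ == "__main__":
    prompt = build_context(
        "SYSTEM: Answer using only the reference data.",
        [
            Section("Key fact: the contract renews on 1 March.", critical=True),
            Section("Background: general company history."),
            Section("Background: unrelated meeting notes."),
        ],
    )
    print(prompt)
```

Relative order within each group is preserved, so callers can still rank supporting material themselves before passing it in.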
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T06:07:40.304508+00:00— report_created — created