Report #73665
[counterintuitive] Why does the model ignore information I placed in the middle of a long context window?
Place critical instructions at the very beginning and critical reference data at the very end of the context. Never bury important information in the middle of long contexts. For retrieval-augmented tasks, use targeted top-k chunk selection rather than dumping entire documents into the prompt.
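A minimal sketch of the placement advice above, assuming a plain-text prompt and a toy word-overlap score standing in for a real retriever. The names `select_top_k` and `build_prompt` are hypothetical and not part of any library. It keeps only the top-k chunks, puts the instructions first, and orders the reference data so the most relevant chunk is the last thing in the prompt.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_top_k(query, chunks, k=3):
    """Return the k chunks sharing the most words with the query, best first.

    Assumption: word overlap is only a stand-in for a real relevance score
    (embeddings, BM25, a reranker, and so on).
    """
    query_words = tokenize(query)
    ranked = sorted(
        chunks,
        key=lambda chunk: len(query_words & tokenize(chunk)),
        reverse=True,  # sorted() is stable, so ties keep their original order
    )
    return ranked[:k]


def build_prompt(instructions, query, chunks, k=3):
    """Assemble a prompt: instructions first, reference data last."""
    selected = select_top_k(query, chunks, k)
    # Reverse so the least relevant selected chunk comes first and the
    # most relevant one sits at the very end of the context.
    reference = "\n\n".join(reversed(selected))
    return f"{instructions}\n\nQuestion: {query}\n\nReference data:\n{reference}"
```

Calling `build_prompt("Answer only from the reference data.", "What is the cache TTL?", chunks, k=2)` on a list of candidate chunks would drop everything outside the top two instead of passing whole documents through. Whether this ordering helps for a particular model is worth measuring rather than assuming.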
Journey Context:
The widespread assumption is that the model attends equally to all provided context: if you put it in the prompt, the model 'has' it. Research on long-context use (often called the 'lost in the middle' effect) instead reports a U-shaped performance curve: models use information at the start and end of the context well, but use information in the middle significantly worse. As a result, adding more context can reduce performance on tasks that depend on middle-placed information, because each addition pushes more material into the weak region. Developers who 'just add more context' to fix issues often make things worse. The fix is strategic placement, not more information. This is not a bug in attention but a structural property of how transformer attention distributions behave over long sequences: initial tokens serve as attention sinks, and recent tokens benefit from positional recency bias.
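One way to act on the U-shaped curve when several chunks must be included is to reorder them so the strongest sit at the edges and the weakest land in the middle. The sketch below is an illustration under that assumption; `reorder_for_edges` is a hypothetical helper, and it expects its input already ranked best-first by whatever relevance score is in use.

```python
def reorder_for_edges(ranked):
    """Reorder best-first items so the best ones sit at the two ends.

    Items at even ranks (0, 2, 4, ...) fill the front in order; items at
    odd ranks (1, 3, 5, ...) fill the back in reverse. The lowest-ranked
    items therefore end up in the middle of the returned list.
    """
    front, back = [], []
    for rank, item in enumerate(ranked):
        if rank % 2 == 0:
            front.append(item)
        else:
            back.append(item)
    return front + back[::-1]
```

For five chunks ranked 1 (best) to 5 (worst), this yields the order 1, 3, 5, 4, 2: the best chunk opens the context, the second best closes it, and the weakest sits in the middle where it is least likely to be used anyway.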
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T06:14:31.231328+00:00 - report_created - created