Report #75394
[counterintuitive] Models retrieve information equally well from any position in the context window
Place critical instructions and key information at the very beginning or very end of the context. For RAG, retrieve fewer highly relevant chunks rather than many marginally relevant ones. Never bury crucial information in the middle of a long prompt.
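The placement advice above can be sketched as a small prompt builder. This is a minimal illustration, not a reference implementation: the function names `reorder_for_edges` and `build_prompt`, the `top_k` default, and the `[source N]` layout are all hypothetical, and the input list is assumed to be already sorted most relevant first by whatever retriever is in use.

```python
from typing import List


def reorder_for_edges(chunks_by_relevance: List[str]) -> List[str]:
    """Put the most relevant chunks at the edges, the least relevant in the middle.

    Assumes the input is sorted most relevant first.
    """
    front: List[str] = []
    back: List[str] = []
    for i, chunk in enumerate(chunks_by_relevance):
        # Even ranks fill from the start, odd ranks fill from the end.
        (front if i % 2 == 0 else back).append(chunk)
    return front + back[::-1]


def build_prompt(instruction: str, chunks_by_relevance: List[str],
                 question: str, top_k: int = 4) -> str:
    """Assemble a prompt with the instruction stated at both ends."""
    kept = chunks_by_relevance[:top_k]  # fewer, highly relevant chunks
    ordered = reorder_for_edges(kept)
    sources = "\n\n".join(
        f"[source {n}]\n{chunk}" for n, chunk in enumerate(ordered, 1)
    )
    return (
        f"{instruction}\n\n"
        f"{sources}\n\n"
        f"Question: {question}\n"
        f"Reminder: {instruction}"
    )
```

With five ranked chunks, the top-ranked chunk lands first, the second-ranked lands last, and the lowest-ranked sits in the middle. Repeating the instruction at the end is one way to keep it out of the low-accuracy middle region; whether the repetition helps for a given model should be checked rather than assumed.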
Journey Context:
Despite having full attention access to every position in the context, LLMs exhibit a U-shaped retrieval accuracy curve: highest at the beginning and end, lowest in the middle. This is not an attention capacity limitation, since the model can attend to middle positions. It is better explained as a training distribution artifact: in natural text, key information tends to appear at the beginning (topic sentences, theses) and at the end (conclusions, summaries). The model has learned this prior and applies it even when it is counterproductive. The effect persists across model sizes and context window lengths, and it gets worse as more context is added. Prompting alone does not reliably remove it, because it is a learned positional prior baked into the model's weights.
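One way to check the U-shaped curve on a specific model is a position sweep: place a known fact at different relative depths among filler passages and see where retrieval fails. The sketch below assumes a caller-supplied `ask_model` function (hypothetical; wrap whatever client is actually in use), and the helper names `insert_needle` and `position_sweep` are illustrative only.

```python
from typing import Callable, Dict, List, Sequence


def insert_needle(fillers: List[str], needle: str, depth: float) -> List[str]:
    """Return a copy of fillers with the needle inserted at relative depth in [0, 1]."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0 and 1")
    index = round(depth * len(fillers))
    return fillers[:index] + [needle] + fillers[index:]


def position_sweep(ask_model: Callable[[str], str], fillers: List[str],
                   needle: str, question: str, expected: str,
                   depths: Sequence[float]) -> Dict[float, bool]:
    """Record, for each depth, whether the model's answer contains the expected string."""
    results: Dict[float, bool] = {}
    for depth in depths:
        docs = insert_needle(fillers, needle, depth)
        prompt = "\n\n".join(docs) + f"\n\nQuestion: {question}"
        answer = ask_model(prompt)  # hypothetical model call supplied by the caller
        results[depth] = expected.lower() in answer.lower()
    return results
```

A single pass yields one true/false result per depth, which is too noisy to draw a curve from. Repeating the sweep with different needles and fillers, and at several total context lengths, gives a more trustworthy picture of where accuracy drops.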
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:08:34.732236+00:00— report_created — created