Report #75183
[counterintuitive] Model fails to retrieve information from the middle of a long context despite large context window
Place critical information at the beginning or end of the context, not in the middle. For retrieval tasks, restructure documents so key facts sit in primacy (first) or recency (last) positions. Prefer retrieval-augmented generation (RAG) with small retrieved chunks over stuffing entire documents into context.
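A minimal sketch of the placement advice, assuming the retrieved chunks arrive already ranked with the most relevant first. The function names `order_for_edges` and `build_prompt` and the prompt layout are hypothetical, chosen only for illustration; they are not part of any particular library.

```python
from typing import List


def order_for_edges(ranked_chunks: List[str]) -> List[str]:
    """Reorder chunks (most relevant first) so the best ones sit at the edges.

    Even-ranked chunks fill the front in order; odd-ranked chunks fill the
    back in reverse, so the least relevant chunks end up in the middle.
    """
    front = ranked_chunks[0::2]
    back = ranked_chunks[1::2]
    return front + back[::-1]


def build_prompt(question: str, ranked_chunks: List[str]) -> str:
    """Assemble a prompt with the question stated last (recency position)."""
    context = "\n\n".join(order_for_edges(ranked_chunks))
    return f"Context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    chunks = ["best", "second", "third", "fourth", "fifth"]
    print(order_for_edges(chunks))
```

With five ranked chunks, the top-ranked chunk opens the context and the second-ranked chunk closes it, while the lowest-ranked chunk lands in the middle, where a miss costs the least.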
Journey Context:
The stated context window is a maximum sequence length, not a guarantee of uniform attention. Research on long-context retrieval reports a U-shaped accuracy curve: models reliably recall information at the start (primacy) and end (recency) of the context, but recall degrades significantly for information in the middle. The pattern has been reported across model sizes and families. Counterintuitively, adding more context can reduce retrieval accuracy for a specific fact, because attention is spread across more tokens. Developers often assume that 'it fits in context' means 'the model can use it,' but effective retrieval requires strategic placement, not just inclusion. This is an attention distribution problem, not a prompt clarity problem.
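One way to check whether a given model shows this positional effect is to plant a known fact at different depths in filler text and record whether the model recalls it. The sketch below assumes a caller-supplied `ask_model` function that takes a prompt string and returns the model's answer as a string; that function, along with the names `build_probe` and `recall_by_depth`, is hypothetical and stands in for whatever client is actually in use.

```python
from typing import Callable, Dict, List


def build_probe(filler: List[str], needle: str, depth: float) -> str:
    """Insert `needle` among filler paragraphs at a relative depth.

    depth = 0.0 puts the needle first, 1.0 puts it last.
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    index = round(depth * len(filler))
    return "\n\n".join(filler[:index] + [needle] + filler[index:])


def recall_by_depth(
    ask_model: Callable[[str], str],  # hypothetical model client
    filler: List[str],
    needle: str,
    question: str,
    expected: str,
    depths: List[float],
) -> Dict[float, bool]:
    """Return, for each depth, whether the expected answer appeared."""
    results: Dict[float, bool] = {}
    for depth in depths:
        context = build_probe(filler, needle, depth)
        answer = ask_model(f"{context}\n\nQuestion: {question}")
        results[depth] = expected.lower() in answer.lower()
    return results
```

Running this over several depths and many different needles, then averaging the results per depth, gives a rough picture of where recall drops for a particular model and context length. A single run per depth is too noisy to draw conclusions from.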
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T08:47:22.461639+00:00— report_created — created