Report #58601
[counterintuitive] Why does the model miss or hallucinate information that is clearly present in the middle of a long context?
Structure long contexts so that the most critical information appears at the beginning or end. When providing large codebases or documentation to an agent, place key instructions, constraints, and target code at the edges. For retrieval, use targeted small-chunk RAG rather than dumping entire files into context. If information must be in the middle, repeat it at the beginning or end as well.
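The edge-placement advice above can be sketched as a small prompt assembler. This is a minimal illustration, not a prescribed implementation: `build_prompt`, its parameters, and the section labels are all hypothetical names chosen for this example.

```python
def build_prompt(instructions: str, target: str, background: list[str]) -> str:
    """Assemble a prompt with the critical material at the edges.

    Hypothetical layout: instructions first, bulk background in the
    middle, then the target code and a repeat of the instructions last.
    """
    parts = [
        "INSTRUCTIONS:\n" + instructions,
        # Lower-priority material goes in the middle, where recall is weakest.
        "BACKGROUND:\n" + "\n\n".join(background),
        "TARGET:\n" + target,
        # Repeat the instructions so they also sit at the end of the context.
        "REMINDER (instructions repeated):\n" + instructions,
    ]
    return "\n\n".join(parts)


prompt = build_prompt(
    instructions="Do not change the public signature of parse_config.",
    target="def parse_config(path): ...",
    background=["# utils.py\n...", "# cli.py\n..."],
)
```

The repeated instructions cost a few extra tokens; whether that trade is worthwhile depends on how long the middle section is.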
Journey Context:
The assumption is: if it fits in the context window, the model can access it equally well from any position. Research on long-context recall shows this is false. LLMs exhibit a U-shaped recall curve: they are significantly better at retrieving information from the beginning (primacy effect) and end (recency effect) of the context, with a substantial performance trough in the middle. This is not a prompt quality issue; it appears to be a property of how attention mechanisms distribute capacity across positions. Adding more context can actually hurt retrieval of specific facts because attention is spread across more tokens. The practical implication for coding agents: a 50-file codebase dump can cause the model to miss the one critical function buried in file 27, even though it 'read' it. Targeted retrieval generally beats bulk context.
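As a rough sketch of targeted small-chunk retrieval, the example below splits files into short chunks and keeps only the few that best match the query. It is an illustration under stated assumptions: plain keyword overlap stands in for whatever scoring a real RAG pipeline would use (typically embeddings), and `chunk_lines`, `tokens`, and `top_chunks` are hypothetical helper names.

```python
import re


def chunk_lines(text: str, size: int = 20) -> list[str]:
    """Split text into chunks of at most `size` lines."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + size]) for i in range(0, len(lines), size)]


def tokens(text: str) -> set[str]:
    """Lowercased identifier-like tokens; a crude stand-in for embeddings."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def top_chunks(query: str, files: dict[str, str], k: int = 3, size: int = 20):
    """Return up to k (path, chunk) pairs ranked by keyword overlap with the query."""
    q = tokens(query)
    scored = []
    for path, text in files.items():
        for idx, chunk in enumerate(chunk_lines(text, size)):
            score = len(q & tokens(chunk))
            if score > 0:  # drop chunks that share nothing with the query
                scored.append((score, path, idx, chunk))
    # Highest score first; ties broken by path and chunk position.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return [(path, chunk) for _, path, _, chunk in scored[:k]]


files = {
    "config.py": "def parse_config(path):\n    return {}",
    "cli.py": "def main():\n    pass",
}
selected = top_chunks("fix parse_config path handling", files, k=1)
```

Only the selected chunks would then go into the prompt, which keeps the context short enough that the relevant function is not buried in the middle.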
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T04:51:06.087591+00:00— report_created — created