Report #91488
[counterintuitive] Why the model ignores or hallucinates information in the middle of long contexts
Place critical information at the very beginning or very end of the context window. Never assume uniform attention across long inputs. Actively curate and compress context rather than dumping raw documents wholesale.
Journey Context:
Developers often assume that if information fits within the context window, the model 'sees' it equally well everywhere. Liu et al. (2023) showed otherwise: on multi-document question answering and key-value retrieval, accuracy follows a U-shaped curve over the position of the relevant information. Models do best when that information sits at the beginning or end of the context and markedly worse when it sits in the middle. In their experiments the effect persisted in models built for extended contexts, so it should not be expected to vanish with a larger model or a better prompt. Adding more context can actually hurt retrieval of middle-placed information. The practical implication is counterintuitive: a shorter, well-organized context often outperforms a longer one that buries the same information in the middle. For factual retrieval, RAG with small, relevant chunks is often a better bet than dumping whole documents into the context.
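One way to act on this is to trim the retrieved chunks to a budget and then order them so the strongest ones sit at the edges of the context. The sketch below is illustrative only: the function names (`select_within_budget`, `reorder_for_edges`, `build_context`) are hypothetical, relevance scores are assumed to come from whatever retriever is already in use, and a character count stands in for a real token count.

```python
from typing import List, Tuple


def select_within_budget(scored: List[Tuple[float, str]], max_chars: int) -> List[str]:
    """Keep the highest-scoring chunks that fit in a character budget.

    Assumption: character length is a crude proxy for token count.
    Returns chunks sorted most relevant first.
    """
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    selected: List[str] = []
    used = 0
    for _score, text in ranked:
        if used + len(text) > max_chars:
            continue  # skip chunks that would exceed the budget
        selected.append(text)
        used += len(text)
    return selected


def reorder_for_edges(chunks_by_relevance: List[str]) -> List[str]:
    """Put the most relevant chunks at the start and end, the weakest in the middle.

    Input must already be sorted most relevant first.
    """
    front: List[str] = []
    back: List[str] = []
    for i, chunk in enumerate(chunks_by_relevance):
        # Alternate: even ranks fill the front, odd ranks fill the back.
        (front if i % 2 == 0 else back).append(chunk)
    return front + back[::-1]


def build_context(scored: List[Tuple[float, str]], max_chars: int) -> str:
    """Curate first, then order for edge placement, then join."""
    kept = select_within_budget(scored, max_chars)
    return "\n\n".join(reorder_for_edges(kept))


if __name__ == "__main__":
    chunks = [(0.2, "low"), (0.9, "high"), (0.5, "mid")]
    print(build_context(chunks, max_chars=100))
```

With five chunks ranked A (best) through E (worst), `reorder_for_edges` yields A, C, E, D, B: the two strongest chunks land first and last, and the weakest lands in the middle. Whether this helps for a given model and task is something to measure rather than assume.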
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T12:09:13.341302+00:00— report_created — created