Report #64261
[counterintuitive] Model ignores or forgets information placed in the middle of a long context window despite it being well within the token limit
Place critical information at the beginning or end of the context; use RAG to reduce context length rather than stuffing everything into the window; test retrieval quality at your actual working context lengths, not just the advertised maximum.
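The placement advice above can be sketched in code. This is a minimal illustration, not a reference implementation: it assumes each retrieved chunk already carries a relevance score from your own retriever, and the names `order_for_edges` and `build_prompt` are hypothetical helpers invented for this example. The highest-scoring chunks are dealt alternately to the front and back of the context so the weakest ones land in the middle, and the question goes last.

```python
from typing import List, Sequence, Tuple


def order_for_edges(scored_chunks: Sequence[Tuple[str, float]]) -> List[str]:
    """Place the highest-scoring chunks at the start and end of the context.

    Chunks are ranked by score (descending) and dealt alternately to the
    front and the back, so the lowest-scoring chunks end up in the middle.
    """
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    front: List[str] = []
    back: List[str] = []
    for rank, (text, _score) in enumerate(ranked):
        if rank % 2 == 0:
            front.append(text)
        else:
            back.append(text)
    # Reverse the back half so the second-best chunk is the very last one.
    return front + back[::-1]


def build_prompt(question: str, scored_chunks: Sequence[Tuple[str, float]]) -> str:
    """Join the reordered chunks and put the question at the end of the prompt."""
    context = "\n\n".join(order_for_edges(scored_chunks))
    return f"{context}\n\nQuestion: {question}"
```

Whether this ordering helps depends on the model and the task, so it is worth measuring against plain relevance order at your working context length.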
Journey Context:
Developers often assume that if information fits within the context window, the model has uniform access to all of it. Empirical research shows a strong U-shaped recall curve: models reliably recall information at the start (primacy) and end (recency) of a context, but recall degrades significantly for content positioned in the middle. This is not a bug in any one implementation; it appears to follow from how attention is distributed across positions. The counterintuitive implication: adding more relevant context can hurt performance if it pushes critical information into the middle of the window, where recall is weakest. A shorter, well-structured context often outperforms a longer, comprehensive one.
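One way to check for the position effect described here is to plant a known fact at different depths and context lengths, then record whether the model returns it. The sketch below assumes a caller-supplied `ask_model` function that takes a prompt string and returns the model's text reply; that function, along with `build_haystack` and `sweep_positions`, is a hypothetical name for illustration and not part of any real library. Context length is approximated by a count of filler sentences rather than tokens.

```python
from typing import Callable, Dict, List, Tuple


def build_haystack(needle: str, filler: str, n_fillers: int, depth: float) -> str:
    """Return n_fillers filler sentences with the needle inserted at a relative depth in [0, 1]."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0 and 1")
    sentences = [filler] * n_fillers
    sentences.insert(round(depth * n_fillers), needle)
    return " ".join(sentences)


def sweep_positions(
    ask_model: Callable[[str], str],
    needle: str,
    question: str,
    expected: str,
    filler: str,
    lengths: List[int],
    depths: List[float],
) -> Dict[Tuple[int, float], bool]:
    """Map each (length, depth) pair to whether the reply contained the expected answer."""
    results: Dict[Tuple[int, float], bool] = {}
    for n_fillers in lengths:
        for depth in depths:
            context = build_haystack(needle, filler, n_fillers, depth)
            reply = ask_model(f"{context}\n\nQuestion: {question}")
            # Substring match is a crude scorer; swap in a stricter check if needed.
            results[(n_fillers, depth)] = expected.lower() in reply.lower()
    return results
```

Running the sweep at the context lengths you actually use, rather than only at the advertised maximum, shows where recall begins to drop for your model and prompt format.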
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T14:20:57.624199+00:00— report_created — created