Report #39533
[counterintuitive] All information within a long context window is equally accessible to the model
Place critical instructions and key retrieval facts at the beginning or end of the context window. Never bury crucial information in the middle of a long document. For retrieval tasks, use RAG to surface the most relevant chunks rather than stuffing entire documents into context and hoping the model finds the right part.
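The placement advice above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `rank_chunks`, `order_for_edges`, and `build_prompt` are hypothetical helper names, and the keyword-overlap score is a stand-in for whatever retriever (embeddings, BM25, etc.) a real RAG pipeline would use.

```python
def rank_chunks(chunks, query, top_k=4):
    """Toy retriever (assumption): score each chunk by how many query words it shares."""
    query_terms = set(query.lower().split())

    def score(chunk):
        return len(query_terms & set(chunk.lower().split()))

    # Keep only the top_k highest-scoring chunks, most relevant first.
    return sorted(chunks, key=score, reverse=True)[:top_k]


def order_for_edges(ranked_chunks):
    """Given chunks sorted most-relevant first, alternate them between the
    front and the back so the least relevant ones land in the middle."""
    front, back = [], []
    for i, chunk in enumerate(ranked_chunks):
        (front if i % 2 == 0 else back).append(chunk)
    return front + back[::-1]


def build_prompt(instructions, ranked_chunks, question):
    """Put instructions at the start, restate them near the end,
    and finish with the question."""
    context = "\n\n".join(order_for_edges(ranked_chunks))
    return (
        f"{instructions}\n\n"
        f"Context:\n{context}\n\n"
        f"Reminder: {instructions}\n"
        f"Question: {question}"
    )
```

With five ranked chunks `a` through `e`, `order_for_edges` yields `a, c, e, d, b`: the two strongest chunks sit at the edges and the weakest sits in the middle. Whether restating the instructions at the end helps a particular model is something to measure rather than assume.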
Journey Context:
Developers often assume that a model with a 128K-token context window can attend uniformly to any part of it. Liu et al. (2023) showed that LLMs exhibit a pronounced U-shaped performance curve for information retrieval: items at the beginning and end of the context are retrieved well, while items in the middle are frequently missed. In the worst case reported, accuracy fell below the closed-book baseline, that is, below what the model achieved with no supporting documents at all. The pattern appeared across model sizes and families (GPT-3.5, Claude, Llama). The implication is counterintuitive but critical: a 200K context window with a crucial fact in the middle can perform worse than a 4K window with the same fact at the edge. A likely cause is attention patterns learned during training, where the beginning (system prompt, task description) and the end (recent conversation, query) are disproportionately important. Adding more context does not linearly add more accessible information; it can actively hurt retrieval of information already present through attention dilution.
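One way to check the position effect described above on a specific model is a small probe that moves a known fact through the context and records whether it is recovered. The sketch below is an assumption-laden harness, not the protocol from Liu et al.: `insert_needle` and `position_sweep` are hypothetical names, and `ask_model` is a placeholder for whatever function sends a prompt to your model and returns its text reply.

```python
from typing import Callable, Dict, Sequence


def insert_needle(filler: Sequence[str], needle: str, depth: float) -> str:
    """Place `needle` at a relative depth in the filler sentences
    (0.0 = very start, 1.0 = very end)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    index = round(depth * len(filler))
    sentences = list(filler[:index]) + [needle] + list(filler[index:])
    return " ".join(sentences)


def position_sweep(
    ask_model: Callable[[str], str],  # hypothetical: prompt in, reply text out
    filler: Sequence[str],
    needle: str,
    question: str,
    expected: str,
    depths: Sequence[float] = (0.0, 0.25, 0.5, 0.75, 1.0),
) -> Dict[float, bool]:
    """For each depth, build a context with the needle at that depth and
    record whether the model's reply contains the expected answer."""
    results = {}
    for depth in depths:
        context = insert_needle(filler, needle, depth)
        prompt = f"{context}\n\nQuestion: {question}"
        reply = ask_model(prompt)
        results[depth] = expected.lower() in reply.lower()
    return results
```

A single pass per depth is only a smoke test. To see a curve rather than noise, repeat the sweep over many needles and filler texts and average the hits per depth.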
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:49:44.824715+00:00— report_created — created