Report #94322
[counterintuitive] With a 128k context window the model can reason over all provided information equally well
Place the most critical information at the beginning and end of your context window. If using RAG, put the highest-ranked documents first and last. Do not bury crucial instructions or data in the middle of a long prompt. For agent workflows, consider breaking long contexts into multiple focused calls rather than one massive prompt.
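The "first and last" placement for RAG results can be sketched as a small reordering step. This is a minimal illustration, not a reference implementation: the function name `place_at_edges` is hypothetical, and it assumes the retriever returns documents sorted best-first.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def place_at_edges(ranked: Sequence[T]) -> List[T]:
    """Reorder a best-first list so the strongest items sit at both ends.

    Assumption: `ranked` is sorted from most to least relevant.
    Items at even ranks fill the front, items at odd ranks fill the back
    (reversed), so the weakest items end up in the middle.
    """
    front: List[T] = []
    back: List[T] = []
    for rank, item in enumerate(ranked):
        if rank % 2 == 0:
            front.append(item)
        else:
            back.append(item)
    # Reverse the back half so the second-best item is the very last one.
    return front + back[::-1]


if __name__ == "__main__":
    docs = ["d1", "d2", "d3", "d4", "d5"]  # d1 is the top-ranked document
    print(place_at_edges(docs))
```

With five documents ranked d1 to d5, the best document lands first, the second best lands last, and the lowest-ranked one sits in the middle, where a miss costs the least.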
Journey Context:
The common assumption with large context windows (128k or 200k tokens) is that all information within the window is equally accessible to the model. Research shows this is false: models exhibit a U-shaped performance curve for information retrieval. Information at the beginning and end of the context is retrieved reliably; information in the middle is significantly less likely to be used, even when it is directly relevant to the query. This 'lost in the middle' effect has been observed across model sizes and appears to stem from how attention behaves on long sequences. Adding more context can actually hurt performance, because more of the relevant material ends up in the middle. The practical fix is structural: organize your context so that the most important information occupies the primacy and recency positions, and use multiple shorter calls when possible rather than stuffing everything into one long context.
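One way to act on the "multiple shorter calls" advice is to group documents into batches under a size budget and build one focused prompt per batch, stating the question at both the start and the end. The sketch below is illustrative only: `rough_token_count`, `batch_by_budget`, and `build_prompt` are hypothetical helpers, and the whitespace word count is a crude stand-in for a real tokenizer.

```python
from typing import Callable, List


def rough_token_count(text: str) -> int:
    # Assumption: whitespace-separated words approximate tokens.
    # Swap in your model's actual tokenizer for real budgets.
    return len(text.split())


def batch_by_budget(
    docs: List[str],
    budget: int,
    count: Callable[[str], int] = rough_token_count,
) -> List[List[str]]:
    """Greedily pack documents into batches that stay within `budget`.

    A single document larger than the budget still gets its own batch.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for doc in docs:
        cost = count(doc)
        if current and used + cost > budget:
            batches.append(current)
            current, used = [], 0
        current.append(doc)
        used += cost
    if current:
        batches.append(current)
    return batches


def build_prompt(question: str, docs: List[str]) -> str:
    # The question appears first and last, in the primacy and recency positions.
    parts = [f"Question: {question}", *docs, f"Reminder of the question: {question}"]
    return "\n\n".join(parts)


if __name__ == "__main__":
    docs = ["a b c", "d e", "f g h i", "j"]
    for batch in batch_by_budget(docs, budget=5):
        print(build_prompt("What changed?", batch))
        print("-----")
```

Each resulting prompt would be sent as its own call, with the per-batch answers combined afterwards; how to merge them depends on the task and is left out of this sketch.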
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T16:54:19.576095+00:00— report_created — created