Report #39175
[counterintuitive] Model fails to find information I placed in the context — context window must be too small or prompt needs improvement
Place the most critical information at the beginning or end of the context window; for long contexts, use RAG to retrieve only relevant chunks rather than stuffing everything in; never assume uniform retrieval across the full context length
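A minimal sketch of the "critical information at the edges" advice, assuming the chunks have already been ranked from most to least relevant by some earlier step. The function name `order_for_edges` is hypothetical and not part of any library; it only illustrates one way to push the weakest material toward the middle of the prompt.

```python
from typing import List


def order_for_edges(ranked_chunks: List[str]) -> List[str]:
    """Reorder chunks (most relevant first) so the strongest sit at the edges.

    Even-ranked chunks fill the front in order; odd-ranked chunks fill the
    back in reverse, so the least relevant chunks end up in the middle.
    """
    front: List[str] = []
    back: List[str] = []
    for rank, chunk in enumerate(ranked_chunks):
        if rank % 2 == 0:
            front.append(chunk)
        else:
            back.append(chunk)
    return front + back[::-1]


# Example: "a" is the most relevant chunk, "e" the least.
ordered = order_for_edges(["a", "b", "c", "d", "e"])
print(ordered)  # ['a', 'c', 'e', 'd', 'b']
```

Whether this reordering helps depends on the model and the task, so it is worth measuring on your own prompts before relying on it.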
Journey Context:
The intuitive model is: if the context window is 128k tokens and my document is 50k tokens, the model can access any fact in that document equally well. In practice, retrieval accuracy tends to follow a U-shaped curve over position: it is highest for information near the beginning or end of the context and lowest in the middle. Models are significantly worse at finding information placed in the middle of a long context, even when that information is clearly stated and would be trivial for a human scanning the text. This is not about the model 'not understanding' the information; it is about how attention is distributed across long sequences. Adding more context can actually degrade performance on specific retrieval tasks. The practical implication: RAG (retrieval-augmented generation, i.e. retrieve the relevant chunks first, then generate from a short context) often outperforms stuffing entire documents into context, even when the context window is large enough to hold everything. The counterintuitive fix is often to use less context, not more.
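The retrieve-then-generate idea can be sketched with the standard library alone. This is an illustration under stated assumptions, not a production retriever: the word-overlap score stands in for whatever similarity measure (typically embeddings) a real pipeline would use, and the helper names `split_into_chunks`, `overlap_score`, `retrieve`, and `build_prompt` are all hypothetical.

```python
import re
from typing import List


def split_into_chunks(text: str, max_words: int = 50) -> List[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def overlap_score(query: str, chunk: str) -> int:
    """Count distinct lowercase word tokens shared by query and chunk.

    Assumption: a crude stand-in for embedding similarity.
    """
    def tokens(s: str) -> set:
        return set(re.findall(r"[a-z0-9]+", s.lower()))

    return len(tokens(query) & tokens(chunk))


def retrieve(query: str, chunks: List[str], k: int = 3) -> List[str]:
    """Return the k chunks with the highest overlap score, best first."""
    ranked = sorted(chunks, key=lambda c: overlap_score(query, c), reverse=True)
    return ranked[:k]


def build_prompt(query: str, chunks: List[str], k: int = 3) -> str:
    """Build a short prompt from only the top-k chunks, not the whole document."""
    context = "\n\n".join(retrieve(query, chunks, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


chunks = [
    "The sky is blue.",
    "The invoice total is 42 dollars.",
    "Cats sleep a lot.",
]
print(build_prompt("What is the invoice total?", chunks, k=1))
```

The design choice here is that the prompt only ever contains `k` chunks, so the relevant fact cannot end up buried in the middle of tens of thousands of unrelated tokens.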
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:13:35.820818+00:00— report_created — created