Report #96343
[counterintuitive] Information is in the context so the model can access it — retrieval failures need better prompts
Place critical information at the beginning or end of the context window; use RAG to keep contexts short rather than stuffing everything in; never assume 'it is in context so the model sees it' for middle-positioned content.
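One way to apply the placement advice to retrieved chunks is sketched below. It is a minimal illustration, not a verified fix: the helper name `order_for_edges` is hypothetical, and it assumes the input is already sorted from most to least relevant by whatever retriever or reranker is in use.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def order_for_edges(ranked: Sequence[T]) -> List[T]:
    """Place the most relevant items at the start and end of the list.

    Assumption: `ranked` is sorted from most to least relevant.
    Items are dealt alternately to the front and the back, so the
    least relevant ones end up in the middle of the result.
    """
    front: List[T] = []
    back: List[T] = []
    for i, item in enumerate(ranked):
        if i % 2 == 0:
            front.append(item)   # ranks 1, 3, 5, ... fill from the start
        else:
            back.append(item)    # ranks 2, 4, 6, ... fill from the end
    return front + back[::-1]


if __name__ == "__main__":
    chunks = ["A", "B", "C", "D", "E"]  # "A" is the most relevant
    print(order_for_edges(chunks))      # ['A', 'C', 'E', 'D', 'B']
```

The two highest-ranked chunks land at the first and last positions, and the lowest-ranked chunk sits in the middle, where a miss costs the least. Whether this helps for a given model should be checked empirically.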
Journey Context:
The common mental model treats the context window like a database: if information is present, the model can retrieve it. In practice, long-context models tend to show a U-shaped retrieval pattern: information at the beginning (primacy) and end (recency) is much more likely to be attended to, while middle content is often overlooked. This has been observed even in models specifically trained for long contexts. A commonly offered explanation is that attention weights are normalized to sum to 1, so across thousands of tokens, middle positions must compete with both strong positional priors favoring early tokens and a recency bias favoring late ones. Adding more context does not help; it tends to hurt retrieval of middle content. Better prompt wording alone does not fix this, because the effect comes from how attention is distributed over long sequences rather than from how the request is phrased.
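The normalization argument can be illustrated with a toy calculation. This is a sketch under strong assumptions: a single softmax over one set of scores, one "needle" token with a fixed score advantage, and every other token sharing the same score. The function name and default values are hypothetical, and real transformers have many heads and layers, so this shows only the dilution effect of normalization, not the U-shaped pattern itself.

```python
import math


def needle_weight(n_tokens: int,
                  needle_logit: float = 2.0,
                  other_logit: float = 0.0) -> float:
    """Softmax weight of one token among `n_tokens` total.

    Assumption: the needle has score `needle_logit` and the remaining
    n_tokens - 1 tokens all have score `other_logit`.
    """
    needle = math.exp(needle_logit)
    others = (n_tokens - 1) * math.exp(other_logit)
    return needle / (needle + others)


if __name__ == "__main__":
    # The needle's score advantage is constant, yet its share of the
    # attention budget shrinks as the context grows.
    for n in (10, 1_000, 100_000):
        print(f"{n:>7} tokens -> needle weight {needle_weight(n):.6f}")
```

Because the weights must sum to 1, a fixed score advantage buys a smaller share of attention as the number of competing tokens grows. This is consistent with the advice to keep contexts short.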
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:17:43.717909+00:00— report_created — created