Report #61420
[counterintuitive] Model has a 128k context window but can't find information placed in the middle of the prompt
Place critical information at the very beginning or very end of the context window. For retrieval-heavy tasks, restructure long contexts so key facts are at the edges, not buried in the middle. Use multiple shorter retrieval calls rather than one massive context dump.
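A minimal sketch of the "key facts at the edges" restructuring, assuming the retrieved chunks are already sorted from most to least relevant. The names `order_for_edges` and `build_prompt` are hypothetical helpers for illustration and are not part of any library.

```python
from typing import List, Sequence


def order_for_edges(ranked_chunks: Sequence[str]) -> List[str]:
    """Place the most relevant chunks at the edges of the context.

    `ranked_chunks` must be sorted from most to least relevant.
    Even-indexed chunks fill the front, odd-indexed chunks fill the back
    (in reverse), so the least relevant chunks land in the middle.
    """
    front: List[str] = []
    back: List[str] = []
    for i, chunk in enumerate(ranked_chunks):
        if i % 2 == 0:
            front.append(chunk)
        else:
            back.append(chunk)
    return front + back[::-1]


def build_prompt(question: str, ranked_chunks: Sequence[str]) -> str:
    """Join the reordered chunks and put the question at the very end."""
    body = "\n\n".join(order_for_edges(ranked_chunks))
    return f"{body}\n\nQuestion: {question}"
```

With five ranked chunks `["a", "b", "c", "d", "e"]`, the resulting order is `["a", "c", "e", "d", "b"]`: the two most relevant chunks sit first and last, and the least relevant one sits in the middle. Whether this helps for a given model should be checked empirically.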
Journey Context:
The implicit assumption is that a 128k context window means uniform, reliable access to all 128k tokens. In practice, retrieval accuracy tends to follow a U-shaped curve: models recall information well at the start (primacy effect) and end (recency effect) of the context, but noticeably worse in the middle. Adding emphasis markers, repeating the instruction, or 'prompting harder' does not reliably fix this, because the degradation appears to come from how the model attends to middle positions rather than from how the prompt is worded. Adding more context makes the middle larger and the problem worse. The counterintuitive implication: a 50k context with your key info at the edges can outperform a 30k context with your key info in the middle.
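One way to check the U-shaped curve for a specific model is a position sweep: insert a known fact at several depths in filler text and record whether the model retrieves it. The sketch below assumes a caller-supplied `ask` function that sends a prompt to the model and returns its answer as a string; `insert_at_depth` and `sweep_depths` are hypothetical names, and no particular model API is implied.

```python
from typing import Callable, Dict, Sequence


def insert_at_depth(filler: Sequence[str], needle: str, depth: float) -> str:
    """Join filler sentences into one text with `needle` inserted at
    relative position `depth` (0.0 = start, 1.0 = end)."""
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must be between 0.0 and 1.0")
    idx = round(depth * len(filler))
    parts = list(filler[:idx]) + [needle] + list(filler[idx:])
    return " ".join(parts)


def sweep_depths(
    ask: Callable[[str], str],  # assumption: wraps whatever model client you use
    filler: Sequence[str],
    needle: str,
    question: str,
    expected: str,
    depths: Sequence[float] = (0.0, 0.25, 0.5, 0.75, 1.0),
) -> Dict[float, bool]:
    """For each depth, record whether the answer contains `expected`."""
    results: Dict[float, bool] = {}
    for depth in depths:
        context = insert_at_depth(filler, needle, depth)
        answer = ask(f"{context}\n\n{question}")
        results[depth] = expected.lower() in answer.lower()
    return results
```

A single pass per depth is noisy; repeating the sweep with different needles and filler, and averaging the hits per depth, gives a more trustworthy picture of where retrieval degrades.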
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:34:48.032986+00:00— report_created — created