Report #45919
[counterintuitive] I'm within the context limit, so the model should find and use all the information I provided — why is it missing things?
Place critical information at the beginning and end of your context window; for retrieval-heavy tasks, restructure so the most important context is near the query; consider multiple focused contexts over one large one; never assume uniform attention across the full context.
Journey Context:
Developers often assume that if the context window is 128K tokens and they provide 80K tokens of context, the model 'sees' all of it equally. Research on long-context retrieval (the 'lost in the middle' effect) shows a U-shaped accuracy curve: models use information near the beginning and end of the context most reliably, and accuracy can degrade significantly for information in the middle. This is not a bug in a particular prompt; it appears to be a property of how current transformer models distribute attention over long sequences, and it tends to become more pronounced as the sequence grows. Adding more context can therefore hurt performance on information buried in the middle. Prompts such as 'read carefully' or 'pay attention to all the text' do not reliably fix it, because the limitation is structural rather than a matter of effort. The practical implication is counterintuitive: providing less, well-structured context often outperforms providing more context, even when the larger context contains all the necessary information.
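The placement advice above can be sketched in a few lines of Python. This is an illustrative sketch only, not a verified fix: it assumes you already have context chunks ranked most-relevant first (for example by a retriever), and the names `order_for_edges`, `build_prompt`, and `max_chunks` are hypothetical helpers invented for this example, not part of any library. It keeps only the top-ranked chunks, pushes the least relevant of those toward the middle, and puts the query last so the most important context sits near it.

```python
from typing import List, Sequence


def order_for_edges(ranked_chunks: Sequence[str]) -> List[str]:
    """Reorder chunks so the most relevant sit at the edges of the context.

    Assumes `ranked_chunks` is sorted most-relevant first. Items at even
    ranks fill the front in order; items at odd ranks fill the back in
    reverse, so the least relevant material ends up in the middle.
    """
    front: List[str] = []
    back: List[str] = []
    for rank, chunk in enumerate(ranked_chunks):
        if rank % 2 == 0:
            front.append(chunk)
        else:
            back.append(chunk)
    return front + back[::-1]


def build_prompt(query: str, ranked_chunks: Sequence[str], max_chunks: int = 5) -> str:
    """Build a prompt from a trimmed, edge-ordered context plus the query.

    `max_chunks` is a hypothetical budget: dropping low-ranked chunks keeps
    the context small instead of filling the whole window.
    """
    kept = list(ranked_chunks[:max_chunks])
    ordered = order_for_edges(kept)
    body = "\n\n".join(ordered)
    # The query goes last, next to the second most relevant chunk.
    return f"{body}\n\nQuestion: {query}"


if __name__ == "__main__":
    chunks = ["most relevant", "second", "third", "fourth", "least relevant"]
    print(order_for_edges(chunks))
```

Whether this ordering helps depends on the model and the task, so measure retrieval accuracy with and without it before relying on it.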
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T07:33:01.294062+00:00— report_created — created