Report #43020
[counterintuitive] The model has a large context window so it can find information anywhere in it
Place critical information at the beginning or end of the context window. For retrieval-heavy tasks, restructure prompts so key content is at the edges. Consider chunking and ranking instead of stuffing everything into one long prompt.
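A minimal sketch of the edge-placement idea, assuming the chunks have already been ranked most-relevant first by some retriever. The function names `place_at_edges` and `build_prompt` are hypothetical, and repeating the question at the end of the prompt is an illustrative choice rather than something this report prescribes.

```python
def place_at_edges(ranked_chunks):
    """Reorder chunks (most relevant first) so the strongest ones sit at
    the start and end of the list and the weakest land in the middle."""
    front, back = [], []
    for i, chunk in enumerate(ranked_chunks):
        # Alternate: even ranks fill the front, odd ranks fill the back.
        (front if i % 2 == 0 else back).append(chunk)
    # Reverse the back half so the 2nd-best chunk ends up last.
    return front + back[::-1]


def build_prompt(question, ranked_chunks):
    """Assemble a prompt with the question at both edges (an assumption
    made for this sketch) and the reordered chunks in between."""
    body = "\n\n".join(place_at_edges(ranked_chunks))
    return f"Question: {question}\n\n{body}\n\nQuestion (repeated): {question}"


if __name__ == "__main__":
    print(place_at_edges(["a", "b", "c", "d", "e"]))
```

With five chunks ranked a (best) through e (worst), the best chunk goes first, the second best goes last, and the worst sits in the middle.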
Journey Context:
A large context window means the model CAN receive more tokens, not that it ATTENDS to all of them uniformly. Long-context evaluations report a U-shaped retrieval curve: models use information near the beginning (primacy effect) and end (recency effect) of the context most reliably, and information in the middle noticeably less. A fact buried in the middle of a 100k-token context is retrieved less reliably than the same fact at the start or end. This appears to be an emergent property of transformer attention patterns rather than a training defect, so scaling the context size alone does not fix it. Adding more context can actually hurt retrieval of specific facts because attention is spread across more tokens. The practical fix is prompt restructuring, not more context.
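One way to send less context rather than more is to chunk the source text and keep only the highest-scoring chunks. The sketch below is illustrative only: the helper names are hypothetical, and the keyword-overlap score is a crude stand-in for a real ranker such as an embedding or BM25 retriever.

```python
import re


def chunk_words(text, size=50):
    """Split text into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def overlap_score(query, chunk):
    """Count distinct query words that also appear in the chunk.
    Assumption: a placeholder for a proper relevance model."""
    q = set(re.findall(r"\w+", query.lower()))
    c = set(re.findall(r"\w+", chunk.lower()))
    return len(q & c)


def top_k_chunks(query, text, k=3, size=50):
    """Return the k chunks that share the most words with the query,
    best first. Ties keep their original document order."""
    chunks = chunk_words(text, size)
    ranked = sorted(chunks, key=lambda ch: overlap_score(query, ch), reverse=True)
    return ranked[:k]
```

Only the selected chunks go into the prompt, so the relevant fact competes with far fewer tokens than it would in the full document.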
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:40:48.081777+00:00— report_created — created