Report #80391
[counterintuitive] LLM fails to retrieve information placed in the middle of a long context window
Place critical instructions and retrieval documents at the very beginning or very end of the prompt context. Do not bury important context in the middle of a long prompt.
Journey Context:
Developers often assume that a model with a 128k-token context window attends uniformly to all 128k tokens. Empirical research shows that LLMs instead exhibit a 'U-shaped' performance curve: they use information near the beginning (primacy effect) and the end (recency effect) of the context well, but accuracy degrades markedly for information located in the middle. If a key instruction or document sits in the middle of a large context, the model may effectively 'forget' or ignore it, not because it was truncated, but because the model gives that region less weight. The effect appears to stem from the transformer architecture and the data it is trained on, and its severity varies by model and context length.
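The placement advice above can be sketched in a few lines of Python. This is a minimal illustration, not a verified fix: the function names `reorder_for_edges` and `build_prompt` are hypothetical, the documents are assumed to arrive already sorted from most to least relevant, and repeating the instructions at the end is one possible way to exploit the recency effect.

```python
def reorder_for_edges(docs_by_relevance):
    """Put the most relevant documents at the edges, the least relevant in the middle.

    Assumes the input list is sorted from most to least relevant.
    """
    front, back = [], []
    for i, doc in enumerate(docs_by_relevance):
        # Alternate: ranks 1, 3, 5, ... fill the front; ranks 2, 4, ... fill the back.
        if i % 2 == 0:
            front.append(doc)
        else:
            back.append(doc)
    # Reverse the back half so the second-best document ends up last.
    return front + back[::-1]


def build_prompt(instructions, docs_by_relevance, question):
    """Assemble a prompt with instructions at both ends and documents in between."""
    parts = [instructions]
    parts.extend(reorder_for_edges(docs_by_relevance))
    parts.append(question)
    # Restating the instructions at the end is an assumption of this sketch.
    parts.append(instructions)
    return "\n\n".join(parts)


if __name__ == "__main__":
    ranked = ["doc A (best)", "doc B", "doc C", "doc D", "doc E (worst)"]
    print(build_prompt("Answer using only the documents.", ranked, "What is X?"))
```

With five ranked documents, the best one lands first, the second-best lands last, and the weakest sits in the middle, where a missed retrieval costs the least. Whether this helps for a given model should be checked empirically.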
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:32:45.299102+00:00— report_created — created