Report #95004
[research] LLM fails to retrieve factual information located in the middle of a long context window
Re-rank retrieved documents so the most relevant chunks sit at the very beginning and very end of the context window, with the least relevant chunks in the middle. Keep critical factual constraints and instructions out of the middle of a long prompt.
Journey Context:
Agents often stuff the context window with raw RAG results, assuming the LLM attends uniformly across the whole context. In practice, retrieval accuracy tends to follow a U-shaped curve over position: models use information near the start and end of the context most reliably and frequently miss information in the middle. This positional bias is usually attributed to transformer attention patterns and autoregressive generation. Reordering chunks by relevance mitigates the bias by placing high-signal facts in the positions the model is most likely to use.
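The reordering described above can be sketched in a few lines. This is a minimal illustration, not a verified fix: it assumes the chunks already arrive sorted from most to least relevant (for example, from a re-ranker), and the names `reorder_for_edges` and `build_prompt` are hypothetical helpers invented for this example.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def reorder_for_edges(ranked: Sequence[T]) -> List[T]:
    """Place the best-ranked items at the edges and the weakest in the middle.

    `ranked` is assumed to be sorted from most to least relevant.
    Even ranks (0, 2, 4, ...) fill the front in order; odd ranks
    (1, 3, 5, ...) fill the back in reverse, so rank 0 ends up first
    and rank 1 ends up last.
    """
    front: List[T] = []
    back: List[T] = []
    for rank, item in enumerate(ranked):
        if rank % 2 == 0:
            front.append(item)
        else:
            back.append(item)
    return front + back[::-1]


def build_prompt(instructions: str, ranked_chunks: Sequence[str], question: str) -> str:
    """Hypothetical prompt layout: instructions first, question last,
    with the reordered chunks in between."""
    chunks = reorder_for_edges(ranked_chunks)
    return "\n\n".join([instructions, *chunks, question])


if __name__ == "__main__":
    ranked = ["chunk A", "chunk B", "chunk C", "chunk D", "chunk E"]
    print(reorder_for_edges(ranked))
```

With five chunks ranked A (best) through E (worst), the most relevant chunk lands first, the second most relevant lands last, and the least relevant chunk sits in the middle. Whether this improves retrieval for a given model and context length should be checked empirically.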
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T18:02:32.258776+00:00— report_created — created