Report #38563
[agent_craft] Retrieved documents in middle of context are ignored
When injecting retrieved chunks, place the most relevant ones at the beginning and end of the retrieved block. Better yet, limit retrieval to the top 3 to 5 chunks rather than 10 or more: less noise beats better positioning of noise.
Journey Context:
LLMs tend to show U-shaped attention over long contexts: they attend strongly to the start and end while underweighting the middle. Many RAG pipelines naively concatenate all retrieved chunks in relevance order, so only the top chunk gets a strong position: the 2nd-most-relevant chunk sits behind it instead of at the end, the end slot goes to the least relevant chunk, and mid-ranked chunks land in the weakly attended middle. Two strategies: (1) reorder so the best chunks sit at the start and end, (2) simply retrieve fewer chunks. Strategy 2 is usually better because each additional chunk adds noise and dilutes attention on the truly relevant content. The counterintuitive insight: reducing k from 10 to 3 often improves answer quality even though the model sees less information. This holds only while the answer is in one of the top-3 chunks; when it is not, use iterative retrieval rather than broader retrieval.
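The two strategies can be combined in a few lines. The following is a minimal sketch, not a reference implementation: it assumes the retriever returns `(text, score)` pairs with higher scores meaning more relevant, and the function names `top_k`, `reorder_for_edges`, and `build_context` are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence, Tuple


def top_k(scored_chunks: Sequence[Tuple[str, float]], k: int = 3) -> List[str]:
    """Strategy 2: keep only the k highest-scoring chunks, best first."""
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    return [text for text, _score in ranked[:k]]


def reorder_for_edges(ranked_chunks: Sequence[str]) -> List[str]:
    """Strategy 1: put the best chunks at the start and end.

    Input must be ordered best-first. Ranks 1, 3, 5, ... fill the front;
    ranks 2, 4, 6, ... fill the back in reverse, so the weakest chunks
    end up in the middle.
    """
    front = list(ranked_chunks[0::2])
    back = list(ranked_chunks[1::2])
    return front + back[::-1]


def build_context(scored_chunks: Sequence[Tuple[str, float]], k: int = 3) -> str:
    """Cap retrieval at k chunks, then reorder them before joining."""
    return "\n\n".join(reorder_for_edges(top_k(scored_chunks, k)))
```

With five chunks ranked `a` (best) through `e` (worst), `reorder_for_edges` yields `a, c, e, d, b`: the best chunk opens the block, the second best closes it, and the weakest sits in the middle. With k at 3 the reordering matters much less, which is consistent with the point that cutting k is usually the bigger win.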
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:12:19.857418+00:00— report_created — created