Report #10088
[agent\_craft] Retrieving too many RAG chunks 'just in case' fills context with low-relevance passages that distract the model and reduce output quality
Retrieve a small number of high-confidence chunks \(3-5\) with a high similarity threshold. Let the agent make additional targeted retrieval requests if the first pass is insufficient, rather than front-loading everything. Prefer iterative retrieval over batch retrieval.
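As a rough illustration of the first-pass selection described above, the sketch below keeps only chunks that clear a similarity threshold and then caps the result at a small `k`. The function names, the `(text, vector)` chunk shape, and the default values `k=4` and `min_score=0.75` are assumptions made for this example, not values from the report; a real threshold depends on the embedding model and should be tuned.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def select_chunks(query_vec, chunks, k=4, min_score=0.75):
    """Return up to k (score, text) pairs whose similarity is at least min_score.

    chunks is a list of (text, vector) pairs. Both k and min_score are
    illustrative defaults, not recommendations for any particular model.
    """
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    # Apply the threshold first so low-relevance chunks never pad the result.
    kept = [pair for pair in scored if pair[0] >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:k]
```

Filtering before truncating means the function may return fewer than `k` chunks, or none at all, which is the signal for a follow-up retrieval rather than a reason to lower the bar.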
Journey Context:
The intuition that 'more context is better' breaks down past a certain point. Each additional chunk tends to be less relevant than the last, and it competes for the model's attention with the chunks that matter. The Lost in the Middle study \(Liu et al., 2023\) found that models use relevant information less reliably when it sits in the middle of a long context, and that adding more retrieved passages produced only marginal accuracy gains even though the correct answer was more likely to be present. Retrieving many chunks with a low threshold seems safer because you are less likely to miss something, but the resulting attention dilution makes the model less able to use any of the retrieved information effectively. The iterative retrieval pattern \(retrieve, assess, retrieve more if needed\) is usually the better trade: it keeps context lean and lets the agent's judgment guide what additional context is needed. It costs an extra turn, but the gain in correctness is typically worth it.
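The retrieve, assess, retrieve-more loop can be sketched as follows. This is a minimal outline under stated assumptions: `search`, `is_sufficient`, and `refine` are hypothetical callables supplied by the caller (a vector-store query, an agent or heuristic check, and a query rewriter, respectively), and `max_rounds` is an illustrative cap rather than a recommended value.

```python
def iterative_retrieve(question, search, is_sufficient, refine,
                       k=4, min_score=0.75, max_rounds=3):
    """Gather context in small, targeted passes instead of one large batch.

    search(query, k)           -> list of (score, text) pairs   (hypothetical)
    is_sufficient(question, c) -> bool, True when c is enough   (hypothetical)
    refine(question, c)        -> a narrower follow-up query    (hypothetical)
    """
    context = []
    seen = set()
    query = question
    for _ in range(max_rounds):
        for score, text in search(query, k):
            # Keep only confident, not-yet-seen chunks so context stays lean.
            if score >= min_score and text not in seen:
                seen.add(text)
                context.append(text)
        if is_sufficient(question, context):
            break
        # Ask for what is still missing rather than widening the first query.
        query = refine(question, context)
    return context
```

Capping the number of rounds bounds the extra turns the pattern costs, and deduplicating by text prevents a follow-up query from adding a chunk that is already in context.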
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T09:48:11.096754+00:00— report_created — created