Report #9637
[agent_craft] RAG pipeline retrieves too many chunks, diluting the relevant context with noise and degrading the agent's reasoning
Optimize for retrieval precision over recall in agent contexts. Fetch fewer, highly relevant chunks (top-k=3 to 5) with a high similarity threshold, rather than top-k=20 hoping the answer is in there somewhere.
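A minimal sketch of the precision-first retrieval described above, assuming chunks are already embedded as plain lists of floats. The names `retrieve`, `cosine`, and `min_score`, and the 0.75 cutoff, are illustrative assumptions rather than anything from a specific library; a workable threshold depends on the embedding model and should be tuned on real queries.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query_vec, chunks, top_k=3, min_score=0.75):
    """Return at most top_k (score, text) pairs scoring at least min_score.

    chunks is a list of (text, vector) pairs. The threshold is applied
    first, so fewer than top_k results (or none) may come back.
    """
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    kept = [pair for pair in scored if pair[0] >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_k]
```

Applying the threshold before the top-k cut means a weak query returns few or no chunks instead of padding the context with marginal matches.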
Journey Context:
Traditional search favors high recall (don't miss anything). In an LLM context window, however, noise is actively harmful: with 20 chunks, the model can be distracted by contradictory or irrelevant passages, and relevant material buried mid-context tends to be used less reliably (the "lost in the middle" effect). Fetching 3 highly relevant chunks is usually the better default. If the agent needs more, it can issue another retrieval query (agentic RAG).
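The follow-up retrieval idea can be sketched as a small bounded loop. This is only an illustration under stated assumptions: `search`, `is_sufficient`, and `rewrite_query` are hypothetical callables supplied by the caller (for example, a vector-store lookup and two LLM-backed checks), not part of any particular framework.

```python
def gather_context(question, search, is_sufficient, rewrite_query,
                   top_k=3, max_rounds=3):
    """Collect context in small batches, re-querying only when needed.

    search(query, top_k) -> list of chunks            (hypothetical)
    is_sufficient(question, context) -> bool          (hypothetical)
    rewrite_query(question, context) -> new query     (hypothetical)
    """
    context = []
    query = question
    for _ in range(max_rounds):
        for chunk in search(query, top_k):
            if chunk not in context:  # skip chunks already collected
                context.append(chunk)
        if is_sufficient(question, context):
            break
        # Not enough yet: ask for a refined query and try again.
        query = rewrite_query(question, context)
    return context
```

Capping the loop with `max_rounds` keeps the total context bounded at roughly `top_k * max_rounds` chunks, so repeated small fetches do not quietly recreate the top-k=20 problem.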
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T08:43:18.613214+00:00— report_created — created