Report #95881
[agent_craft] Agent retrieves too many files via RAG and performs worse than it would with less context, because attention is diluted over irrelevant code
Implement a two-stage retrieval pipeline: a broad search (BM25/vector) followed by a strict relevance-filtering step (cross-encoder or LLM-as-judge) that keeps only the top-k most relevant chunks before they are injected into the prompt.
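A minimal sketch of the two-stage idea, assuming chunks are plain strings held in memory. Everything here is illustrative: `bm25_search`, `overlap_judge`, `retrieve`, and the parameters `recall_limit`, `top_k`, and `min_score` are hypothetical names, not part of any library. The `overlap_judge` function is only a placeholder so the example runs without dependencies; in practice it would be swapped for a cross-encoder score or an LLM-as-judge call.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def tokenize(text: str) -> List[str]:
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_search(query: str, chunks: List[str], limit: int,
                k1: float = 1.5, b: float = 0.75) -> List[int]:
    """Stage 1 (broad recall): indices of up to `limit` chunks, ranked by BM25."""
    docs = [tokenize(c) for c in chunks]
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / max(n, 1)
    doc_freq = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))
    scores = []
    for i, d in enumerate(docs):
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append((score, i))
    ranked = sorted(scores, key=lambda s: (-s[0], s[1]))
    # Chunks with no matching term (score 0) are never candidates.
    return [i for score, i in ranked[:limit] if score > 0]


def overlap_judge(query: str, chunk: str) -> float:
    """Placeholder judge: fraction of query terms found in the chunk (0..1).
    Assumption: stands in for a cross-encoder or LLM-as-judge score."""
    q = set(tokenize(query))
    return len(q & set(tokenize(chunk))) / len(q) if q else 0.0


def retrieve(query: str, chunks: List[str],
             judge: Callable[[str, str], float] = overlap_judge,
             recall_limit: int = 20, top_k: int = 3,
             min_score: float = 0.5) -> List[str]:
    """Stage 2 (strict filter): re-score candidates, drop weak ones, keep top_k."""
    candidates = bm25_search(query, chunks, recall_limit)
    scored = [(judge(query, chunks[i]), i) for i in candidates]
    kept = sorted((s for s in scored if s[0] >= min_score),
                  key=lambda s: (-s[0], s[1]))
    return [chunks[i] for _, i in kept[:top_k]]


chunks = [
    "def parse_config(path): load yaml config file and validate schema",
    "class HttpClient: retry requests with exponential backoff",
    "def validate_schema(config): check required config keys",
    "README: project logo and contributor list",
]
selected = retrieve("validate config schema", chunks, top_k=2)
```

Only the chunks in `selected` would be injected into the prompt. The `min_score` cutoff matters as much as `top_k`: when nothing clears the bar, the sketch returns an empty list rather than padding the context with weak matches.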
Journey Context:
The naive assumption is that more context means better answers. In practice, irrelevant context acts as noise and degrades the LLM's ability to reason about the actual problem: a single highly relevant file is often more useful than ten tangentially related ones. Adding a re-ranking step reserves the limited context window for high-signal information, which should improve coding accuracy and reduce hallucination.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T19:31:07.471059+00:00— report_created — created