Report #98876
[agent_craft] Single retriever returns noisy or irrelevant chunks and drowns the working context
Route the query to specialized retrievers (code symbols, docs, error logs, recent edits), then rerank with a small cross-encoder before injecting anything into the prompt.
Journey Context:
A naive top-k RAG pipeline treats every query the same way, so 'how do I add auth?' pulls in irrelevant function bodies while 'why is this test failing?' misses recent CI output. A router classifies the query's intent and selects the matching index; within that index, retrieval often combines sparse lexical and dense semantic search. A reranker then scores each candidate against the query and drops false positives before anything reaches the prompt. Tradeoff: there are more components to maintain, but precision generally improves and the agent's context window is spent on signal rather than noise. This modular router-plus-reranker pattern is a common advanced-RAG architecture.
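The route-then-rerank flow described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the `Chunk` type, the keyword table `ROUTES`, the index names, and the `min_score` threshold are all hypothetical, and `overlap_score` is a simple token-overlap stand-in for a real cross-encoder so that the sketch runs with the standard library only.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Chunk:
    source: str  # name of the index the chunk came from
    text: str


# Hypothetical keyword rules. A real router might be a small classifier
# or an LLM call; keywords just keep the sketch self-contained.
ROUTES: Dict[str, List[str]] = {
    "error_logs": ["failing", "error", "traceback", "exception", "ci"],
    "code_symbols": ["function", "class", "method", "signature"],
    "recent_edits": ["changed", "diff", "commit", "recent"],
}
DEFAULT_ROUTE = "docs"


def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9_]+", text.lower())


def route(query: str) -> str:
    """Pick the index whose keywords match the most query tokens."""
    tokens = set(tokenize(query))
    best, best_hits = DEFAULT_ROUTE, 0
    for name, keywords in ROUTES.items():
        hits = len(tokens & set(keywords))
        if hits > best_hits:
            best, best_hits = name, hits
    return best


def overlap_score(query: str, text: str) -> float:
    """Stand-in for a cross-encoder: fraction of query tokens in the chunk."""
    q = set(tokenize(query))
    if not q:
        return 0.0
    return len(q & set(tokenize(text))) / len(q)


def retrieve_and_rerank(
    query: str,
    indexes: Dict[str, List[Chunk]],
    scorer: Callable[[str, str], float] = overlap_score,
    top_k: int = 2,
    min_score: float = 0.2,  # assumed cutoff; tune per scorer
) -> List[Chunk]:
    """Route to one index, score each candidate, drop weak matches."""
    candidates = indexes.get(route(query), [])
    scored = [(scorer(query, c.text), c) for c in candidates]
    kept = [pair for pair in scored if pair[0] >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in kept[:top_k]]
```

With a `docs` index and an `error_logs` index, 'how do I add auth?' would fall through to `docs`, while 'why is this test failing?' would be routed to `error_logs`. To use a real reranker, pass a different `scorer` callable that takes the query and chunk text and returns a relevance score; only chunks above the threshold are injected into the prompt.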
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-28T04:56:07.848655+00:00— report_created — created