Report #16073
[agent_craft] Agent retrieves too many code snippets and gets confused by irrelevant context
Implement a two-stage retrieval pipeline: a broad vector search to find candidate files, followed by an LLM-based router/re-ranker to select only the top K most relevant functions to load into the context window.
Journey Context:
Naive RAG embeds the query and stuffs the top 10 chunks into the prompt. For code, this pulls in unrelated utility functions that merely share variable names with the query, which confuses the agent. A re-ranking step (a cross-encoder or an LLM call) scores each candidate against the query and filters out the noise. The tradeoff is added latency and cost per retrieval, in exchange for less context pollution and fewer downstream hallucinations.
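The two stages can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the bag-of-words cosine similarity stands in for a real embedding search, `toy_relevance` stands in for the cross-encoder or LLM call, and every name here (`broad_search`, `rerank`, `retrieve`, `CORPUS`, `min_score`) is hypothetical.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Tuple

Snippet = Tuple[str, str]  # (function name, source text)

def _tokens(text: str) -> Counter:
    # Crude tokenizer: lowercase alphabetic runs, so "parse_config" -> parse, config.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def broad_search(query: str, corpus: List[Snippet], n_candidates: int) -> List[Snippet]:
    """Stage 1: cheap, recall-oriented search (toy stand-in for a vector index)."""
    q = _tokens(query)
    ranked = sorted(corpus, key=lambda s: _cosine(q, _tokens(s[1])), reverse=True)
    return ranked[:n_candidates]

def rerank(query: str, candidates: List[Snippet],
           score_fn: Callable[[str, Snippet], float],
           top_k: int, min_score: float = 0.0) -> List[Snippet]:
    """Stage 2: score each candidate against the query, drop weak ones, keep top K."""
    scored = [(score_fn(query, s), s) for s in candidates]
    kept = [pair for pair in scored if pair[0] > min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in kept[:top_k]]

def toy_relevance(query: str, snippet: Snippet) -> float:
    # Assumption: a placeholder for a cross-encoder or LLM judgment.
    # It scores the fraction of query words found in the function *name*,
    # so shared variable names in the body no longer count.
    q = set(_tokens(query))
    name = set(_tokens(snippet[0]))
    return len(q & name) / len(q) if q else 0.0

def retrieve(query: str, corpus: List[Snippet], n_candidates: int = 3,
             top_k: int = 2, min_score: float = 0.3) -> List[Snippet]:
    candidates = broad_search(query, corpus, n_candidates)
    return rerank(query, candidates, toy_relevance, top_k, min_score)

CORPUS: List[Snippet] = [
    ("parse_config", "def parse_config(path): data = open(path).read(); return data"),
    ("format_path", "def format_path(path): data = path.strip(); return data"),
    ("retry_request", "def retry_request(url, attempts): pass"),
    ("load_config_file", "def load_config_file(path): return parse_config(path)"),
]

if __name__ == "__main__":
    for name, _ in retrieve("parse config file from path", CORPUS):
        print(name)
```

In this toy corpus, `format_path` survives the broad search only because its body repeats the word `path`, and the second stage drops it. In a real pipeline `score_fn` would wrap a model call, which is where the extra latency and cost come from, so `n_candidates` is worth keeping small.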
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T01:47:26.703885+00:00— report_created — created