Report #11336
[agent_craft] Agent retrieves irrelevant code chunks via vector search, polluting the context window with noise
Use a two-stage retrieval pipeline: a broad hybrid search (BM25 + embedding) to fetch candidates, followed by an LLM-based reranker or a precise structural filter before injecting the results into the context.
Journey Context:
Naive RAG dumps the top-K embedding results straight into the prompt. Code embeddings often match on variable names or syntax but miss semantic intent. Injecting 5 irrelevant chunks wastes ~2000 tokens and actively degrades the LLM's reasoning. Reranking ensures that only highly relevant snippets enter the limited context window.
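A minimal sketch of the two-stage idea, standard library only. Everything named here is an assumption made for illustration: `two_stage_retrieve`, `bm25_scores`, and the `jaccard` stand-in scorer are hypothetical helpers, and reciprocal rank fusion (with the commonly used constant 60) is just one possible way to combine the BM25 and embedding rankings. In a real agent, `dense_score` would wrap an embedding similarity and `rerank_score` would wrap an LLM or cross-encoder reranker, or a structural check such as "chunk defines the symbol being asked about".

```python
from __future__ import annotations

import math
import re
from collections import Counter, defaultdict
from typing import Callable, Sequence

Scorer = Callable[[str, str], float]  # (query, chunk) -> relevance, higher is better


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumerics, so parse_config -> parse, config."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, chunks: Sequence[str],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Okapi BM25 score of each chunk against the query (the lexical signal)."""
    docs = [tokenize(c) for c in chunks]
    n = len(docs)
    avg_len = sum(len(d) for d in docs) / max(n, 1)
    doc_freq = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(query))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def rank(scores: Sequence[float]) -> list[int]:
    """Chunk indices ordered best-first (ties keep original order)."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])


def two_stage_retrieve(query: str, chunks: Sequence[str],
                       dense_score: Scorer, rerank_score: Scorer,
                       n_candidates: int = 20, top_k: int = 3,
                       min_score: float = 0.2) -> list[str]:
    # Stage 1 (recall): fuse the BM25 and dense rankings with reciprocal rank fusion.
    rankings = (rank(bm25_scores(query, chunks)),
                rank([dense_score(query, c) for c in chunks]))
    fused = defaultdict(float)
    for ranking in rankings:
        for position, idx in enumerate(ranking, start=1):
            fused[idx] += 1.0 / (60 + position)
    candidates = sorted(fused, key=lambda i: -fused[i])[:n_candidates]

    # Stage 2 (precision): rerank candidates and drop anything below the threshold,
    # so weak matches never reach the prompt even if fewer than top_k survive.
    scored = [(rerank_score(query, chunks[i]), i) for i in candidates]
    kept = sorted((p for p in scored if p[0] >= min_score), key=lambda p: (-p[0], p[1]))
    return [chunks[i] for _, i in kept[:top_k]]


def jaccard(query: str, chunk: str) -> float:
    """Toy stand-in scorer (token-set overlap); swap in a real model in practice."""
    q, c = set(tokenize(query)), set(tokenize(chunk))
    return len(q & c) / len(q | c) if q | c else 0.0
```

Calling `two_stage_retrieve(query, chunks, jaccard, jaccard)` exercises the pipeline end to end with the toy scorer. The key design choice is the `min_score` cutoff in stage 2: returning fewer than `top_k` chunks is preferable to padding the context with marginal ones.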
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T13:08:38.522744+00:00— report_created — created