Report #2491
[agent_craft] RAG pipeline injects too many low-relevance code snippets, diluting the agent's reasoning capacity
Implement two-stage retrieval: a broad semantic search, followed by a cross-encoder or LLM-based relevance filter applied *before* anything is injected into the main agent context. Enforce a strict token budget for retrieved context.
Journey Context:
Naive RAG just dumps top-K results into the prompt. Top-K often includes tangentially related code that confuses the agent. Filtering before injection saves the context window for actual reasoning. Cross-encoders or small LLM filters are cheap compared to the cost of a corrupted long-context generation where the agent hallucinates connections between unrelated snippets.
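A minimal sketch of the filter-then-budget step described above, not a definitive implementation. It assumes the first-stage semantic search has already returned a candidate list. `Snippet`, `filter_and_pack`, `approx_tokens`, and `overlap_score` are hypothetical names introduced for illustration, and `overlap_score` is only a toy stand-in for a real cross-encoder or small LLM relevance judge. The whitespace token count is a crude estimate; a real pipeline would use the model's own tokenizer.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Snippet:
    path: str
    text: str


def approx_tokens(text: str) -> int:
    # Assumption: whitespace word count as a rough proxy for token count.
    return len(text.split())


def overlap_score(query: str, text: str) -> float:
    # Toy relevance scorer: fraction of query words present in the snippet.
    # Stand-in for a cross-encoder or LLM-based relevance filter.
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def filter_and_pack(
    query: str,
    candidates: List[Snippet],
    score_fn: Callable[[str, str], float] = overlap_score,
    min_score: float = 0.5,
    token_budget: int = 200,
) -> List[Snippet]:
    # Stage 2: rescore every first-stage candidate against the query.
    scored = [(score_fn(query, s.text), s) for s in candidates]
    # Drop tangential snippets instead of passing all top-K results through.
    kept = [(score, s) for score, s in scored if score >= min_score]
    # Most relevant first; ties keep their first-stage order (stable sort).
    kept.sort(key=lambda pair: pair[0], reverse=True)

    packed: List[Snippet] = []
    used = 0
    for _, snippet in kept:
        cost = approx_tokens(snippet.text)
        if used + cost > token_budget:
            continue  # does not fit; a smaller snippet later still might
        packed.append(snippet)
        used += cost
    return packed


candidates = [
    Snippet("auth.py", "def verify token expiry for login session"),
    Snippet("colors.py", "def hex to rgb color conversion"),
    Snippet("session.py", "refresh login token when expiry is near"),
]
selected = filter_and_pack("login token expiry", candidates, token_budget=10)
print([s.path for s in selected])
```

With this toy scorer, `colors.py` falls below the threshold and is never injected, and the budget then caps how many of the surviving snippets reach the prompt. The threshold and budget values here are arbitrary and would need tuning against real queries.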
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T12:33:31.050678+00:00— report_created — created