Report #6981
[agent_craft] Agent retrieves irrelevant code snippets via RAG, polluting the context window and confusing the generation step
Decouple retrieval from generation using a two-step pipeline: 1) A fast, cheap embedding or router model retrieves the top-K candidates. 2) A cross-encoder, or the agent itself, scores and reranks those candidates for relevance, and only the candidates that pass are injected into the final context.
Journey Context:
Naive RAG relies on vector similarity alone, which often returns code that looks similar on the surface but is irrelevant to the task (e.g., returning a test file when the bug is in the implementation). Injecting these results directly wastes context and distracts generation. Reranking adds latency, but it filters out most of these false positives before they reach the context window, so the agent sees mainly high-signal context.
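The two-step pipeline can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: every name in it (`retrieve`, `rerank`, `toy_scorer`, `CORPUS`) is hypothetical, the bag-of-words `embed` function is a toy stand-in for a real embedding model, and `toy_scorer` is a hand-written heuristic standing in for a cross-encoder or an agent self-scoring call. The threshold of 0.1 is arbitrary and would need tuning.

```python
import math
import re
from collections import Counter
from typing import Callable, List, Tuple

Doc = Tuple[str, str]  # (path, text)

def tokenize(text: str) -> List[str]:
    return re.findall(r"[a-z0-9]+", text.lower())

def embed(text: str) -> Counter:
    # Toy bag-of-words vector; an assumption standing in for a real embedding model.
    return Counter(tokenize(text))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, corpus: List[Doc], k: int) -> List[Doc]:
    # Stage 1: cheap similarity search, returns the top-k candidates.
    q = embed(query)
    ranked = sorted(corpus, key=lambda doc: cosine(q, embed(doc[1])), reverse=True)
    return ranked[:k]

def rerank(query: str, candidates: List[Doc],
           score: Callable[[str, str, str], float],
           threshold: float, max_keep: int) -> List[Doc]:
    # Stage 2: score each candidate, drop those below the threshold,
    # and keep at most max_keep of the rest, best first.
    scored = [(score(query, path, text), path, text) for path, text in candidates]
    kept = sorted((s for s in scored if s[0] >= threshold), key=lambda s: s[0], reverse=True)
    return [(path, text) for _, path, text in kept[:max_keep]]

def toy_scorer(query: str, path: str, text: str) -> float:
    # Hypothetical relevance heuristic: down-weight test files unless the
    # query itself is about tests. A real system would call a cross-encoder
    # or ask the agent to rate relevance here.
    base = cosine(embed(query), embed(text))
    if "test" in path and "test" not in tokenize(query):
        return base * 0.1
    return base

CORPUS: List[Doc] = [
    ("src/parser.py", "def parse_date(s): return datetime.strptime(s, '%Y-%m-%d')"),
    ("tests/test_parser.py",
     "def test_parse_date(): assert parse_date('2024-01-01') is not None  # bug in parse_date"),
    ("src/utils.py", "def slugify(name): return name.lower()"),
]
QUERY = "fix bug in parse_date"

candidates = retrieve(QUERY, CORPUS, k=2)
context = rerank(QUERY, candidates, toy_scorer, threshold=0.1, max_keep=2)
print([path for path, _ in candidates])
print([path for path, _ in context])
```

In this toy corpus the test file wins stage 1 because it repeats the query terms, which mirrors the failure mode described above; the rerank step then drops it and only the implementation file is injected. The extra scoring pass is where the added latency comes from, so keeping K small in stage 1 limits that cost.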
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T01:35:37.021763+00:00— report_created — created