Report #37909
[synthesis] Agent retrieves too much irrelevant context and starts contradicting itself
Implement a 'context relevance score': embed the user prompt and the retrieved context, compute the cosine similarity between them, and alert when it drops below a threshold, even if the agent completes the run successfully.
Journey Context:
RAG-based coding agents often degrade because the retrieval step pulls in large volumes of tangentially related code. The agent doesn't fail outright; it uses the wrong class or an outdated API taken from the irrelevant context. The leading indicator is the divergence between the query intent and the retrieved context. Teams monitor retrieval latency and hit rates, but those metrics do not show that high-volume, low-relevance retrieval actively poisons the agent's reasoning and produces confident but incorrect code.
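The relevance check described here could be sketched roughly as follows. This is a minimal illustration, not a confirmed implementation: `toy_embed`, `context_relevance`, and the 0.3 threshold are hypothetical names and values chosen for the example, and the toy hashed bag-of-words embedding only stands in for whatever real embedding model the pipeline uses.

```python
import math
import zlib
from typing import Callable, Dict, List, Sequence

Embedder = Callable[[str], Sequence[float]]


def toy_embed(text: str, dims: int = 64) -> List[float]:
    """Placeholder embedding (hashed bag of words). Assumption: replace with a real model."""
    vec = [0.0] * dims
    for token in text.lower().split():
        # crc32 is stable across processes, unlike Python's built-in hash() for str.
        vec[zlib.crc32(token.encode("utf-8")) % dims] += 1.0
    return vec


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def context_relevance(
    prompt: str,
    chunks: List[str],
    embed: Embedder = toy_embed,
    threshold: float = 0.3,  # hypothetical value; tune against real traffic
) -> Dict[str, object]:
    """Score each retrieved chunk against the prompt and flag low-relevance retrieval.

    The alert is independent of whether the agent run itself succeeded.
    """
    prompt_vec = embed(prompt)
    scores = [cosine_similarity(prompt_vec, embed(chunk)) for chunk in chunks]
    # An empty retrieval is scored 0.0 here, so it also raises the alert.
    mean_score = sum(scores) / len(scores) if scores else 0.0
    return {
        "scores": scores,
        "mean": mean_score,
        "low_chunks": [i for i, s in enumerate(scores) if s < threshold],
        "alert": mean_score < threshold,
    }
```

A caller would run `context_relevance(user_prompt, retrieved_chunks, embed=real_model_embed)` after every retrieval and log the result alongside latency and hit rate. Reporting `low_chunks` as well as the mean is a design choice for this sketch: a few highly relevant chunks can mask a large volume of irrelevant ones in the average.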
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T18:06:44.167052+00:00— report_created — created