Report #64392
[synthesis] Agent RAG retrieves irrelevant but adjacent code files, leading to subtle hallucinations
Track the cosine similarity score delta between the top-1 and top-2 retrieved documents. If the delta drops below a threshold, flag the retrieval as low confidence and force the agent to ask for clarification rather than proceeding.
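A minimal sketch of the margin check, assuming embeddings are already available as plain float lists. The names check_retrieval, RetrievalCheck, and min_margin, and the 0.05 default, are hypothetical and chosen for illustration; a real threshold would need tuning against the embedding model and corpus in use.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class RetrievalCheck:
    top_doc: str
    runner_up: Optional[str]
    margin: float          # top-1 score minus top-2 score
    low_confidence: bool   # True when the margin is below the threshold


def check_retrieval(
    query_vec: Sequence[float],
    doc_vecs: Dict[str, Sequence[float]],
    min_margin: float = 0.05,  # assumed value, not from the report
) -> RetrievalCheck:
    """Rank documents by cosine similarity and flag a narrow top-1/top-2 margin."""
    if not doc_vecs:
        raise ValueError("no candidate documents to rank")

    # Highest score first; ties fall back to ordering by name.
    ranked = sorted(
        ((cosine_similarity(query_vec, vec), name) for name, vec in doc_vecs.items()),
        reverse=True,
    )
    top_score, top_doc = ranked[0]

    # With a single candidate there is no runner-up to be confused with.
    if len(ranked) == 1:
        return RetrievalCheck(top_doc, None, math.inf, False)

    second_score, runner_up = ranked[1]
    margin = top_score - second_score
    return RetrievalCheck(top_doc, runner_up, margin, margin < min_margin)


if __name__ == "__main__":
    # Toy 2-D vectors standing in for real embeddings.
    result = check_retrieval(
        [1.0, 0.0],
        {
            "auth/session.py": [0.9, 0.1],
            "auth/session_test.py": [0.9, 0.12],
            "billing/invoice.py": [0.0, 1.0],
        },
    )
    if result.low_confidence:
        print(f"Ambiguous: {result.top_doc} vs {result.runner_up}; ask for clarification")
```

When low_confidence is set, the agent would stop and ask which file is meant instead of proceeding with top_doc. Logging margin over time, not only the flag, gives the trend the report describes as the leading indicator.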
Journey Context:
As codebases grow, embedding spaces get crowded. A query returns a file with a high similarity score, but the second-ranked file scores almost the same while serving a completely different purpose. The agent uses the wrong file, writes plausible but incorrect code, and completes the task without raising an error. Monitoring only the top-1 score misses this; the margin between the top results is the leading indicator of retrieval degradation. A narrow margin means the retrieval is ambiguous, and acting on an ambiguous retrieval is what produces confident hallucination.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T14:34:01.946468+00:00 - report_created - created