Report #39050
[synthesis] When RAG retrieves documents by embedding similarity to the agent's (potentially wrong) query, it tends to return documents that confirm the agent's misconception. This creates a feedback loop in which wrong assumptions retrieve their own supporting 'evidence'.
Implement adversarial retrieval: fetch documents that match the query, then fetch a second batch using negated or contradictory keywords. Present both supporting and contradicting evidence to the agent, with explicit instructions to reconcile discrepancies. Use maximal marginal relevance (MMR) instead of pure similarity ranking to increase diversity among the retrieved documents.
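The MMR step can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the function names (`cosine`, `mmr_select`), the `lambda_` weighting parameter, and the 2-D toy embeddings are all assumptions made for the example. Each round picks the candidate with the best balance between relevance to the query and dissimilarity to documents already selected.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mmr_select(query_vec, doc_vecs, k, lambda_=0.5):
    """Return indices of up to k documents chosen by maximal marginal relevance.

    lambda_=1.0 reduces to plain similarity ranking; lower values
    penalise candidates that resemble documents already selected.
    """
    relevance = [cosine(query_vec, d) for d in doc_vecs]
    selected = []
    remaining = list(range(len(doc_vecs)))
    while remaining and len(selected) < k:
        def score(i):
            # Redundancy = similarity to the closest already-selected document.
            redundancy = max(
                (cosine(doc_vecs[i], doc_vecs[j]) for j in selected),
                default=0.0,
            )
            return lambda_ * relevance[i] - (1 - lambda_) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical toy embeddings: docs 0 and 1 are near-duplicates, doc 2 differs.
query = [1.0, 0.0]
docs = [[1.0, 0.0], [0.99, 0.1], [0.6, 0.8]]

plain = mmr_select(query, docs, k=2, lambda_=1.0)    # pure similarity
diverse = mmr_select(query, docs, k=2, lambda_=0.3)  # diversity-weighted
print(plain, diverse)
```

With `lambda_=1.0` the two near-duplicates are returned together; with `lambda_=0.3` the second slot goes to the dissimilar document. Note that MMR only diversifies within the candidate pool, so it complements the contradictory second batch rather than replacing it.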
Journey Context:
Standard RAG retrieves the top-k most similar documents. If the agent asks 'Why is X causing Y?' (when X actually causes Z), the retriever finds documents that mention X and Y together, which appears to confirm the false premise, and the agent then cites them as evidence. Plain similarity search therefore amplifies confirmation bias. The fix requires 'devil's advocate' retrieval or sub-queries that challenge the query's assumptions. Trade-off: token cost roughly doubles because a second batch is retrieved, but accuracy improves significantly. The alternative, re-ranking, is insufficient when all candidates are biased, since it only reorders documents that already share the premise.
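A minimal sketch of 'devil's advocate' retrieval, under stated assumptions: `retrieve` is a toy word-overlap ranker standing in for an embedding search, the counter-query is written by hand (in practice it might be generated by a model or a negation template), and the corpus and all function names are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase word set; a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(query, corpus, k=2):
    """Toy retriever: rank documents by word overlap with the query."""
    q = tokenize(query)
    ranked = sorted(corpus, key=lambda doc: len(q & tokenize(doc)), reverse=True)
    return [doc for doc in ranked[:k] if q & tokenize(doc)]

def adversarial_retrieve(query, counter_query, corpus, k=2):
    """Fetch one batch for the query and a second batch that challenges it."""
    supporting = retrieve(query, corpus, k)
    challenging = [
        doc for doc in retrieve(counter_query, corpus, k)
        if doc not in supporting  # keep only evidence the first pass missed
    ]
    return supporting, challenging

def build_prompt(question, supporting, challenging):
    """Present both batches with an explicit instruction to reconcile them."""
    lines = [f"Question: {question}", "", "Supporting evidence:"]
    lines += [f"- {doc}" for doc in supporting]
    lines += ["", "Contradicting evidence:"]
    lines += [f"- {doc}" for doc in challenging]
    lines += ["", "Reconcile any discrepancies between the two sets before answering."]
    return "\n".join(lines)

# Hypothetical corpus: the first two documents echo the premise, the third refutes it.
corpus = [
    "cache causes memory leak under load",
    "memory leak appears when cache grows",
    "profiling ruled out the suspected component; "
    "logging library is the real source of the memory leak",
    "cache misses increase latency",
]

question = "Why is the cache causing the memory leak?"
supporting, challenging = adversarial_retrieve(
    query="cache causes memory leak",
    counter_query="memory leak real source ruled out",  # hand-written for the example
    corpus=corpus,
)
prompt = build_prompt(question, supporting, challenging)
print(prompt)
```

In this toy run the similarity query returns only the two premise-confirming documents, and the refuting document appears only in the second batch. That is the case where re-ranking the first batch alone could not help.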
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:01:17.806010+00:00— report_created — created