Report #40575
[research] LLM fails to synthesize facts across multiple documents, contradicting itself within a single response
Decompose multi-hop queries into single-hop sub-queries, retrieve for each, and synthesize only after all sub-answers are grounded, rather than attempting end-to-end generation from a single broad retrieval.
Journey Context:
Standard RAG performs poorly on multi-hop questions (e.g., 'What library does the author of X use?'). The retriever fetches documents about X but misses the documents that describe the author's library, so the LLM hallucinates the connection. Iterative retrieval (such as IRCoT) or query decomposition forces the model to gather all necessary premises before drawing a conclusion, drastically reducing hallucinated connections.
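A minimal sketch of the decomposition approach, under stated assumptions: the corpus, the word-overlap retriever, the regex "extractors", and the names `retrieve`, `answer_multi_hop`, and `HOPS` are all hypothetical stand-ins for a real document store, retriever, and LLM calls. It only illustrates the control flow: one retrieval per hop, each sub-answer tied to a passage, and no synthesis if any hop is ungrounded.

```python
import re

# Toy corpus standing in for a real document store (hypothetical data).
CORPUS = [
    "ProjectX is a command line tool for log analysis.",
    "ProjectX was written by Ada Example.",
    "Ada Example uses the libfoo library for parsing.",
]

def tokens(text):
    """Lowercased word set used for naive overlap scoring."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, k=2):
    """Stand-in retriever: rank passages by word overlap with the query."""
    q = tokens(query)
    ranked = sorted(CORPUS, key=lambda doc: len(q & tokens(doc)), reverse=True)
    return ranked[:k]

def answer_multi_hop(hops):
    """Resolve single-hop sub-queries in order.

    Each hop is (sub_query_template, regex_with_one_group). The template may
    reference earlier answers as {0}, {1}, ... The regex stands in for an
    LLM extraction step. Returns a list of (sub_query, answer, passage), or
    None if any hop cannot be grounded in a retrieved passage.
    """
    answers, trace = [], []
    for template, pattern in hops:
        sub_query = template.format(*answers)
        for passage in retrieve(sub_query):
            match = re.search(pattern, passage)
            if match:
                answers.append(match.group(1))
                trace.append((sub_query, match.group(1), passage))
                break
        else:
            # No supporting passage: refuse to synthesize instead of guessing.
            return None
    return trace

# Decomposition of "What library does the author of ProjectX use?"
HOPS = [
    ("Who wrote ProjectX?", r"written by ([A-Z]\w+ [A-Z]\w+)"),
    ("What library does {0} use?", r"uses the (\w+) library"),
]

if __name__ == "__main__":
    result = answer_multi_hop(HOPS)
    if result is None:
        print("Could not ground every hop; not answering.")
    else:
        for sub_query, answer, passage in result:
            print(f"{sub_query} -> {answer}  [evidence: {passage}]")
```

In a real pipeline the final synthesis prompt would receive only the grounded (sub-query, answer, passage) triples, so the model has every premise in context before it states the connection.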
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:34:43.445965+00:00— report_created — created