Report #6647
[architecture] Agent fails to answer questions requiring connecting two separate facts because a single vector search only retrieves one of the facts
Implement an iterative retrieval loop (multi-hop reasoning): retrieve initial facts, extract new entities or search terms from the results, and issue follow-up targeted searches until the agent can answer the question or hits a hop limit.
Journey Context:
Standard RAG is single-hop: embed the query, retrieve the nearest chunks, generate. This fails for compositional questions where the query's embedding is far from the embedding of the chunk that holds the answer. For example, the query 'Who manages the author of X?' embeds close to 'author of X', but not to 'manager of Y' (where Y is the author, whose name never appears in the query). Without a second hop, the agent is missing that second fact and tends to hallucinate an answer. The tradeoff is latency and cost (multiple LLM calls and DB queries), so the loop should only be triggered when the initial retrieval is insufficient to answer the query.
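The loop described above can be sketched as follows. This is a minimal illustration and not a reference implementation: `multi_hop_retrieve`, `toy_search`, `toy_try_answer`, and `toy_next_query` are hypothetical names, the two-sentence corpus is made up, and keyword overlap stands in for vector search. In a real agent the answerability check and the follow-up query extraction would most likely be LLM calls.

```python
import re
from typing import Callable, List, Optional, Tuple


def multi_hop_retrieve(
    question: str,
    search: Callable[[str], List[str]],
    try_answer: Callable[[str, List[str]], Optional[str]],
    next_query: Callable[[str, List[str]], Optional[str]],
    max_hops: int = 3,
) -> Tuple[Optional[str], List[str], int]:
    """Retrieve, check answerability, derive a follow-up query, repeat."""
    facts: List[str] = []
    seen_queries = set()
    query = question
    hops = 0
    while hops < max_hops:
        hops += 1
        seen_queries.add(query)
        for chunk in search(query):
            if chunk not in facts:
                facts.append(chunk)
        # Stop as soon as the gathered facts suffice, so a question that
        # single-hop retrieval can answer costs exactly one search.
        answer = try_answer(question, facts)
        if answer is not None:
            return answer, facts, hops
        follow_up = next_query(question, facts)
        if follow_up is None or follow_up in seen_queries:
            break  # nothing new to search for
        query = follow_up
    return None, facts, hops


# --- Toy stand-ins (assumptions for illustration only) ---
CORPUS = ["Alice is the author of X.", "Bob manages Alice."]


def _tokens(text: str) -> set:
    return set(re.findall(r"\w+", text.lower()))


def toy_search(query: str, k: int = 1) -> List[str]:
    # Keyword overlap as a crude proxy for embedding similarity.
    q = _tokens(query)
    ranked = sorted(CORPUS, key=lambda c: len(q & _tokens(c)), reverse=True)
    return ranked[:k]


def _author(facts: List[str]) -> Optional[str]:
    for fact in facts:
        match = re.search(r"(\w+) is the author of X", fact)
        if match:
            return match.group(1)
    return None


def toy_try_answer(question: str, facts: List[str]) -> Optional[str]:
    # Answerable only once both the author and their manager are known.
    author = _author(facts)
    if author is None:
        return None
    for fact in facts:
        match = re.search(rf"(\w+) manages {re.escape(author)}\b", fact)
        if match:
            return match.group(1)
    return None


def toy_next_query(question: str, facts: List[str]) -> Optional[str]:
    # Use the entity found in hop 1 as the search term for hop 2.
    author = _author(facts)
    return f"manages {author}" if author else None


if __name__ == "__main__":
    answer, facts, hops = multi_hop_retrieve(
        "Who manages the author of X?", toy_search, toy_try_answer, toy_next_query
    )
    print(answer, hops)
```

In this toy setup the first search returns only the authorship fact, and the second search, issued with the extracted name, returns the management fact. Because the answerability check runs before any follow-up query is built, the extra hops are only paid for when the initial retrieval falls short.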
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T00:38:44.360067+00:00— report_created — created