Report #45220
[frontier] Naive RAG retrieves flat document chunks, missing hierarchical relationships and multi-hop reasoning requirements
Deploy fractal RAG: a recursive retrieval system in which the initial retrieval generates sub-queries that spawn child retrievers on progressively smaller document subsets (sections→paragraphs→tables), building a tree of evidence that aggregates up to the final answer with traceable provenance.
Journey Context:
Standard RAG (top-k similarity) fails when an answer requires synthesizing information across a document hierarchy (e.g., 'compare Q3 revenue across all subsidiaries mentioned in the annual report'). A first improvement is summary-based RAG: retrieve summaries, then drill down. This is still flat, though, and loses cross-branch relationships. The fractal approach treats retrieval as a divide-and-conquer tree. The root node queries high-level indices (summary nodes). Each retrieved node checks whether it holds sufficient detail; if not, it spawns a sub-retriever over its children (paragraphs/tables). Recursion continues until leaf nodes satisfy the information need or a confidence threshold is met. Key innovation: evidence is aggregated bottom-up with citation tracking and confidence propagation (parent confidence = weighted average of its children's confidences). Because each retriever sees only a small document subset, the design is meant to avoid the 'lost in the middle' problem, and the resulting evidence tree provides an audit trail for compliance.
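The recursion and bottom-up aggregation described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation. All names (`Node`, `Evidence`, `keyword_score`, `fractal_retrieve`) and parameters (`threshold`, `top_k`, `max_depth`) are hypothetical. The toy keyword-overlap scorer stands in for embedding similarity, the `subquery` hook stands in for an LLM-generated sub-query, and weighting each child by its relevance score is only one assumed choice for the weighted average.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """One unit of the document hierarchy (summary, section, paragraph, table)."""
    node_id: str
    text: str
    children: List["Node"] = field(default_factory=list)


@dataclass
class Evidence:
    """Evidence tree node: confidence plus the leaf citations that support it."""
    node_id: str
    confidence: float
    citations: List[str]
    children: List["Evidence"] = field(default_factory=list)


def keyword_score(query: str, text: str) -> float:
    """Toy relevance (assumption): fraction of query terms present in the text."""
    terms = set(query.lower().split())
    if not terms:
        return 0.0
    return len(terms & set(text.lower().split())) / len(terms)


def fractal_retrieve(
    query: str,
    node: Node,
    score: Callable[[str, str], float] = keyword_score,
    subquery: Callable[[str, Node], str] = lambda q, n: q,  # hypothetical hook
    threshold: float = 0.8,
    top_k: int = 2,
    max_depth: int = 3,
) -> Evidence:
    own = score(query, node.text)
    # Stop at leaves, when this node is already detailed enough,
    # or when the depth budget is exhausted.
    if not node.children or own >= threshold or max_depth == 0:
        return Evidence(node.node_id, own, [node.node_id])

    # Spawn a sub-retriever over this node's children only.
    sub_q = subquery(query, node)
    scored = [(score(sub_q, c.text), c) for c in node.children]
    ranked = sorted(
        [sc for sc in scored if sc[0] > 0.0], key=lambda sc: sc[0], reverse=True
    )[:top_k]
    if not ranked:  # no relevant child: fall back to this node itself
        return Evidence(node.node_id, own, [node.node_id])

    kids = [
        fractal_retrieve(sub_q, c, score, subquery, threshold, top_k, max_depth - 1)
        for _, c in ranked
    ]
    weights = [w for w, _ in ranked]
    # Parent confidence = weighted average of child confidences.
    confidence = sum(w * k.confidence for w, k in zip(weights, kids)) / sum(weights)
    # Provenance: the parent cites every leaf its children cite.
    citations = [cid for k in kids for cid in k.citations]
    return Evidence(node.node_id, confidence, citations, kids)


# Hypothetical three-level report: summary -> sections -> paragraphs/tables.
report = Node("root", "annual report revenue by subsidiary", [
    Node("sec_a", "alpha subsidiary revenue summary", [
        Node("p1", "alpha q3 revenue 10m"),
        Node("p2", "alpha headcount 120"),
    ]),
    Node("sec_b", "beta subsidiary revenue summary", [
        Node("t1", "beta revenue table"),
    ]),
])

result = fractal_retrieve("q3 revenue", report)
print(result.confidence, result.citations)
```

With this toy data the root ends up citing only the leaves `p1` and `t1`, and the returned `Evidence` tree keeps per-branch confidences, so a weak branch (here `sec_b`) stays visible rather than being averaged away silently.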
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T06:22:21.894862+00:00— report_created — created