Report #61088
[frontier] Single-vector RAG failing on multi-hop reasoning queries
Implement 'Cascading Retrieval with Early Exit': a pipeline of sparse (BM25) -> dense (vector) -> rerank -> graph traversal, with cost-aware heuristics that terminate early once confidence thresholds are met.
Journey Context:
Naive RAG (a single embedding search) fails when the answer requires connecting disparate documents (e.g., 'How did X's policy in 2023 affect Y's product in 2024?'). The fix is staged retrieval: start with cheap sparse retrieval for candidate generation, filter with dense vectors, rerank with a cross-encoder, then expand to graph traversal (RAG over a knowledge graph) only if confidence is still low. Crucially, add 'early exit' logic: if stage 2 (dense retrieval) already yields high confidence, skip the expensive graph steps. This trades cost against accuracy per query instead of paying for every stage on every query.
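The control flow described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the names `Stage`, `cascade`, `confidence`, `exit_threshold`, `cost`, and `budget` are all hypothetical, the confidence heuristic (top candidate score, assumed normalized to [0, 1]) is an assumption, and the stage functions are stubs standing in for real BM25, vector, cross-encoder, and graph components.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# A candidate is (doc_id, score); scores are assumed to be normalized to [0, 1].
Candidate = Tuple[str, float]
# A stage receives the query and the previous stage's candidates,
# and returns a new (filtered, rescored, or expanded) candidate list.
StageFn = Callable[[str, List[Candidate]], List[Candidate]]


@dataclass
class Stage:
    name: str
    run: StageFn
    cost: float                      # relative cost units (hypothetical)
    exit_threshold: Optional[float]  # None means "never exit after this stage"


@dataclass
class CascadeResult:
    candidates: List[Candidate]
    stages_run: List[str]
    total_cost: float


def confidence(candidates: List[Candidate]) -> float:
    """Assumed heuristic: the best candidate's score, or 0.0 if there are none."""
    return max((score for _, score in candidates), default=0.0)


def cascade(query: str, stages: List[Stage],
            budget: float = float("inf")) -> CascadeResult:
    """Run stages in order; stop on high confidence or when the budget is spent."""
    candidates: List[Candidate] = []
    stages_run: List[str] = []
    total_cost = 0.0
    for stage in stages:
        # Cost-aware guard: do not start a stage that would exceed the budget.
        if total_cost + stage.cost > budget:
            break
        candidates = stage.run(query, candidates)
        stages_run.append(stage.name)
        total_cost += stage.cost
        # Early exit: skip the remaining (more expensive) stages.
        if (stage.exit_threshold is not None
                and confidence(candidates) >= stage.exit_threshold):
            break
    return CascadeResult(candidates, stages_run, total_cost)


if __name__ == "__main__":
    # Stub stages with fixed outputs, standing in for real retrievers.
    def stub(output: List[Candidate]) -> StageFn:
        return lambda query, previous: list(output)

    pipeline = [
        Stage("sparse_bm25", stub([("doc1", 0.40), ("doc2", 0.35)]), 1.0, None),
        Stage("dense_vector", stub([("doc1", 0.93)]), 5.0, 0.90),
        Stage("rerank", stub([("doc1", 0.97)]), 20.0, 0.90),
        Stage("graph_traversal", stub([("doc1", 0.97), ("doc7", 0.80)]), 100.0, None),
    ]
    result = cascade("How did X's policy affect Y's product?", pipeline)
    print(result.stages_run, result.total_cost)
```

In this sketch the sparse stage has no exit threshold because it only generates candidates, while the dense and rerank stages can each end the cascade. Thresholds and cost units would need tuning against real queries; the values shown are placeholders.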
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:01:32.247347+00:00— report_created — created