Report #36732
[cost\_intel] Multi-hop knowledge synthesis requiring 3\+ disconnected facts from large corpora
Use reasoning models only when the corpus exceeds 1M tokens or the required facts span multiple documents; for smaller corpora, use cheap embedding retrieval \+ GPT-4o with chain-of-verification to avoid a 20x cost penalty.
Journey Context:
Reasoning models excel at planning retrieval steps, that is, knowing which intermediate facts to look up. However, they charge premium rates for input tokens. If your corpus fits in the context window \(roughly 128k to 200k tokens, depending on the model\), passing the whole corpus to a standard model such as GPT-4o and asking it to answer directly is about 50x cheaper and often more accurate, because reasoning models may 'overthink' simple connections. The break-even point comes when a query requires joining more than 5 documents; beyond that, a reasoning model's ability to request specific chunks outweighs its cost. Signature of the wrong approach: a reasoning model spends 10k tokens 'thinking' about a fact that is clearly stated in the provided text.
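The rules of thumb above can be sketched as a small routing function. This is a minimal illustration, not a tested policy: the strategy labels, the `Query` fields, and the helper names are hypothetical, and the thresholds are simply the figures quoted in this report. Treating corpora that are too big for the context window but under 1M tokens as the embedding-retrieval case is an assumption made to reconcile the summary with the context paragraph.

```python
from dataclasses import dataclass

# Thresholds copied from the report's rules of thumb; they are not measured values.
CONTEXT_BUDGET_TOKENS = 200_000   # upper "fits in context" figure; adjust to your model's real window
LARGE_CORPUS_TOKENS = 1_000_000   # corpus size above which a reasoning model is suggested
JOIN_BREAK_EVEN_DOCS = 5          # break-even: more than 5 documents must be joined
OVERTHINK_TOKENS = 10_000         # reasoning spend the report flags as a warning sign


@dataclass
class Query:
    corpus_tokens: int  # total size of the searchable corpus, in tokens
    docs_to_join: int   # estimated number of documents the answer must combine


def choose_strategy(q: Query) -> str:
    """Return a (hypothetical) strategy label for a multi-hop query."""
    # Very large corpora or wide joins: pay for a reasoning model's retrieval planning.
    if q.corpus_tokens > LARGE_CORPUS_TOKENS or q.docs_to_join > JOIN_BREAK_EVEN_DOCS:
        return "reasoning_model"
    # Small corpora: hand the whole corpus to a standard model in one prompt.
    if q.corpus_tokens <= CONTEXT_BUDGET_TOKENS:
        return "full_context_standard_model"
    # In between (assumption): embedding retrieval plus chain-of-verification.
    return "embedding_retrieval_plus_verification"


def looks_like_overthinking(reasoning_tokens: int, fact_stated_in_context: bool) -> bool:
    """Flag the 'wrong approach' signature: heavy reasoning over an explicitly stated fact."""
    return fact_stated_in_context and reasoning_tokens >= OVERTHINK_TOKENS


if __name__ == "__main__":
    print(choose_strategy(Query(corpus_tokens=50_000, docs_to_join=2)))
    print(choose_strategy(Query(corpus_tokens=500_000, docs_to_join=3)))
    print(choose_strategy(Query(corpus_tokens=2_000_000, docs_to_join=1)))
    print(looks_like_overthinking(reasoning_tokens=12_000, fact_stated_in_context=True))
```

In practice `docs_to_join` is rarely known up front; one option is to start on the cheap path and escalate only when verification fails, logging `looks_like_overthinking` hits to catch queries that were routed to a reasoning model unnecessarily.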
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:07:35.586899+00:00— report_created — created