Report #94136
[cost_intel] When is using reasoning models for pure factual retrieval a pure cost sink?
Use instruct models with RAG for entity lookups, dates, and capitals; reasoning models have the same knowledge cutoffs and add no accuracy on these queries while costing 5-10x more.
Journey Context:
Reasoning models have no later knowledge cutoff and no better retrieval than instruct models; they are optimized for reasoning over facts, not for recalling them. Answering 'What is the capital of Mongolia' with o1 costs $0.15 versus $0.015 for GPT-4o, a 10x difference, with identical accuracy. The 'reasoning tax' is justified only when the task requires manipulating facts, not just retrieving them. Implement a routing classifier: if the query is a simple lookup (NER-extracted entity + relation), route to instruct+RAG; if it requires multi-hop inference or 'what if' simulation, route to a reasoning model.
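The routing rule above can be sketched roughly as follows. This is a minimal illustration, not a tested production router: the regex cue lists, the 12-word limit, and the `route` function name are all assumptions made for this example, and simple keyword matching stands in for the NER-based entity + relation check the report describes. Ambiguous queries fall back to the reasoning route here so accuracy is not sacrificed; defaulting to the cheaper route instead is an equally valid choice.

```python
import re

# Hypothetical cue list: phrases that suggest multi-hop inference or
# 'what if' simulation. A real router would use a trained classifier.
REASONING_CUES = re.compile(
    r"\b(what if|why|how would|compare|suppose|assuming|explain)\b",
    re.IGNORECASE,
)

# Hypothetical stand-in for "NER-extracted entity + relation":
# a short wh-question built around a form of "to be".
LOOKUP_CUES = re.compile(
    r"^(what|who|when|where|which)\b.*\b(is|was|are|were)\b",
    re.IGNORECASE,
)

MAX_LOOKUP_WORDS = 12  # assumed threshold; tune on real traffic


def route(query: str) -> str:
    """Return 'instruct+rag' for simple lookups, 'reasoning' otherwise."""
    q = query.strip()
    # Reasoning cues take priority so "What if ..." is never
    # mistaken for a plain "What is ..." lookup.
    if REASONING_CUES.search(q):
        return "reasoning"
    if LOOKUP_CUES.search(q) and len(q.split()) <= MAX_LOOKUP_WORDS:
        return "instruct+rag"
    # Assumed default: pay the reasoning tax when unsure.
    return "reasoning"


if __name__ == "__main__":
    for q in (
        "What is the capital of Mongolia?",
        "What if Mongolia moved its capital to Erdenet?",
    ):
        print(f"{q!r} -> {route(q)}")
```

Logging which route each query takes, and spot-checking the lookup route for wrong answers, is one way to confirm the heuristics are not sending inference-heavy queries to the cheaper model.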
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T16:35:44.072407+00:00— report_created — created