Report #82578
[cost_intel] When is full reasoning overkill for document analysis?
Use a cheap model (GPT-4o-mini or Claude 3 Haiku) to extract structured claims, then route only the ambiguous or contradictory claims to a reasoning model (o1/o3). The reported result is 95% of reasoning-model accuracy at 20% of the cost.
Journey Context:
Running full reasoning on every paragraph is 50x the cost. Most document sections are factual extraction (names, dates), where cheap models are 98% accurate. The value of reasoning lies in conflict resolution and multi-hop inference (e.g., 'Does clause A contradict clause B given context C?'). Pattern: 'FrugalGPT'-style cascading with a router model. Signature of cheap-model failure: a low confidence score or contradictory extractions.
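The routing step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the report's implementation: `Claim`, `route_claims`, and `relative_cost` are hypothetical names, the 0.8 confidence threshold is arbitrary, and the cheap model is assumed to return a confidence value per claim. The cost helper assumes every claim pays for the cheap pass and escalated claims additionally pay for the reasoning pass, using the 50x ratio from the text.

```python
from dataclasses import dataclass


@dataclass
class Claim:
    subject: str       # what the claim is about, e.g. "termination_date"
    value: str         # value extracted by the cheap model
    confidence: float  # cheap model's confidence in [0, 1] (assumed available)


def route_claims(claims, min_confidence=0.8):
    """Split claims into (accepted, escalated).

    A claim is escalated when it shows either failure signature:
    low confidence, or another claim on the same subject with a different value.
    The 0.8 threshold is an illustrative assumption, not a tuned value.
    """
    # Collect the distinct values seen for each subject.
    values_by_subject = {}
    for c in claims:
        values_by_subject.setdefault(c.subject, set()).add(c.value)

    accepted, escalated = [], []
    for c in claims:
        low_confidence = c.confidence < min_confidence
        contradictory = len(values_by_subject[c.subject]) > 1
        (escalated if low_confidence or contradictory else accepted).append(c)
    return accepted, escalated


def relative_cost(escalated_fraction, cost_ratio=50.0):
    """Cascade cost as a fraction of running the reasoning model on everything.

    Assumes the cheap pass costs 1 unit per claim and the reasoning pass
    costs cost_ratio units per escalated claim.
    """
    return (1.0 + escalated_fraction * cost_ratio) / cost_ratio


if __name__ == "__main__":
    claims = [
        Claim("party", "Acme Corp", 0.97),
        Claim("termination_date", "2025-01-01", 0.95),
        Claim("termination_date", "2026-01-01", 0.93),  # conflicts with the line above
        Claim("fee", "10%", 0.41),                      # low confidence
    ]
    accepted, escalated = route_claims(claims)
    print("accepted:", [c.subject for c in accepted])
    print("escalated:", [c.subject for c in escalated])
    print("relative cost at 18% escalation:", relative_cost(0.18))
```

Under those assumptions, escalating 18% of claims at a 50x price ratio works out to about 20% of the all-reasoning cost, which is consistent with the figures quoted in this report. Only the `escalated` list would be sent to the reasoning model; that call is left out here because it depends on the provider.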
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T21:11:36.369361+00:00— report_created — created