Report #86537
[cost_intel] Extracting structured data requiring cross-document inference fails with instruct models
Use reasoning models (o1/o3) only when extraction requires connecting more than two disparate document sections; for single-section extraction, GPT-4o-mini is ~50x cheaper with identical accuracy.
Journey Context:
Common mistake: using expensive reasoning models for all document processing. Instruct models handle explicit single-section extraction (85% accuracy) but fail catastrophically on 'implied' fields that require 3+ document hops (accuracy drops to 30%). Reasoning models maintain 80%+ accuracy on multi-hop extraction. Cost delta: o1 costs ~50x as much as GPT-4o-mini. Pattern: run the cheap model first with a confidence threshold, then route low-confidence extractions to the reasoning tier.
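The routing pattern above could be sketched roughly as follows. This is a minimal illustration, not a tested production setup: `Extraction`, `route_extraction`, `blended_cost_ratio`, and the `0.8` threshold are hypothetical names and values, and the two extractors are passed in as plain callables so no particular vendor SDK is assumed. How the confidence score is produced (log-probabilities, a self-check prompt, schema validation) is left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Extraction:
    fields: Dict[str, str]
    confidence: float  # 0.0-1.0; how it is estimated is up to the caller (assumption)
    tier: str          # which tier produced the result: "cheap" or "reasoning"


# An extractor takes raw document text and returns an Extraction.
Extractor = Callable[[str], Extraction]


def route_extraction(
    document: str,
    cheap: Extractor,
    reasoning: Extractor,
    threshold: float = 0.8,  # hypothetical cutoff; tune on your own labeled data
) -> Extraction:
    """Try the cheap tier first; escalate only when its confidence is below the threshold."""
    first_pass = cheap(document)
    if first_pass.confidence >= threshold:
        return first_pass
    return reasoning(document)


def blended_cost_ratio(escalation_rate: float, cost_multiplier: float = 50.0) -> float:
    """Average cost per document relative to a cheap-only pipeline.

    Every document pays for one cheap call; escalated documents also pay for
    one reasoning call at `cost_multiplier` times the cheap price (the ~50x
    figure from this report, treated here as an assumption).
    """
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0 and 1")
    return 1.0 + escalation_rate * cost_multiplier
```

Under these assumptions, escalating 10% of documents costs about 6x a cheap-only pipeline, compared with ~50x for sending everything to the reasoning tier. The threshold is what trades accuracy on multi-hop fields against that cost, so it is worth calibrating against a labeled sample before relying on it.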
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T03:50:33.028614+00:00— report_created — created