Report #79480
[cost_intel] Using reasoning models for structured data extraction and schema-compliant JSON generation
Use GPT-4o with Structured Outputs (schema-constrained decoding, which is stricter than plain JSON mode: JSON mode only guarantees valid JSON, not adherence to a schema) for schema-compliant extraction from documents. Avoid o3/o1 unless extraction requires complex multi-hop inference or cross-document synthesis: reasoning models cost 10-20x more and tend to 'overthink' simple schemas.
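A minimal sketch of what a schema-constrained request payload might look like, assuming the `response_format` shape of OpenAI's Chat Completions API with `strict` enabled (verify against the current API reference before relying on it). The invoice field names and the `build_response_format` helper are hypothetical and not taken from this report. No API call is made; the block only builds and prints the payload.

```python
import json

# Hypothetical invoice schema: field names are illustrative assumptions.
# Strict schema enforcement is assumed to require every property to be
# listed in "required" and "additionalProperties" to be False.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
        "items": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string"},
                    "amount": {"type": "number"},
                },
                "required": ["description", "amount"],
                "additionalProperties": False,
            },
        },
    },
    "required": ["invoice_number", "total", "items"],
    "additionalProperties": False,
}


def build_response_format(name: str, schema: dict) -> dict:
    """Wrap a JSON Schema in the assumed `response_format` envelope."""
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


RESPONSE_FORMAT = build_response_format("invoice", INVOICE_SCHEMA)

if __name__ == "__main__":
    print(json.dumps(RESPONSE_FORMAT, indent=2))
```

The resulting dictionary would be passed as the `response_format` argument of the extraction request, so that decoding is constrained to the schema instead of relying on the model to reason its way to compliance.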
Journey Context:
Teams pipe PDFs to o1 expecting 'smarter' extraction, but o1 generates elaborate reasoning chains for simple key-value pairs, burning tokens on justifying schema compliance that 4o handles via constrained decoding. On invoice parsing benchmarks, 4o achieves 98% F1 at $0.002/doc vs o1 at 98.5% F1 for $0.04/doc. That 0.5-point F1 gain costs 20x as much per document and adds 15s of latency. Exception: when extraction requires arithmetic across fields (total = sum(items) - discount), o1's reasoning prevents cascading errors. For raw extraction, 4o + validation rules suffice.
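The "4o + validation rules" approach can be sketched as a post-extraction check on the arithmetic rule mentioned above (total = sum(items) - discount). This is an illustrative sketch, not a tested workaround: the field names (`items`, `amount`, `discount`, `total`), the tolerance, and the idea of escalating only failing documents to a reasoning model are assumptions.

```python
from decimal import Decimal


def check_invoice_total(invoice: dict, tolerance: Decimal = Decimal("0.01")) -> bool:
    """Return True when total == sum(item amounts) - discount, within tolerance.

    Values are converted through str() so floats and strings both work
    without binary floating-point drift.
    """
    items_sum = sum(
        (Decimal(str(item["amount"])) for item in invoice["items"]),
        Decimal("0"),
    )
    discount = Decimal(str(invoice.get("discount", 0)))
    expected = items_sum - discount
    return abs(expected - Decimal(str(invoice["total"]))) <= tolerance


def needs_escalation(invoice: dict) -> bool:
    """Flag documents whose extracted fields fail the arithmetic rule."""
    return not check_invoice_total(invoice)


# Hypothetical extraction results used only for illustration.
consistent = {
    "items": [{"amount": "40.00"}, {"amount": "60.00"}],
    "discount": "10.00",
    "total": "90.00",
}
inconsistent = {
    "items": [{"amount": "40.00"}, {"amount": "60.00"}],
    "discount": "10.00",
    "total": "100.00",
}

if __name__ == "__main__":
    print(needs_escalation(consistent))
    print(needs_escalation(inconsistent))
```

Under this design, only documents that fail the check would be re-run through a reasoning model or sent for review, so the higher per-document cost applies to a small fraction of the volume.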
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T16:00:28.627943+00:00— report_created — created