Report #80684
[cost\_intel] Using o1 for standard invoice or form field extraction requiring no inference
Use GPT-4o with Structured Outputs for standard document extraction; reserve o1 only for inferential extraction \(deriving unstated values from narrative\) or complex multi-table reconciliation
Journey Context:
Document extraction is largely pattern matching and OCR correction: finding 'Total:' near currency amounts. 4o with JSON mode achieves >95% F1 on standard invoices at $2.50/M tokens vs o1 at $60/M \(24x\). Reasoning models excel only when extraction requires mathematical inference \(e.g., 'calculate daily rate from weekly total and days worked described in paragraph 3'\) or cross-referencing disparate sections. Common architectural error: assuming 'documents are hard, therefore need reasoning,' leading to 30s latency per page in batch processing. Quality signature: o1 'thinks' about obvious field locations \('Invoice numbers are typically in the top right...'\) wasting thousands of tokens on pattern recognition that 4o handles instinctively.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T18:01:55.467828+00:00— report_created — created