Report #52551
[cost_intel] When does GPT-4o mini match GPT-4o on structured data extraction tasks
Use GPT-4o mini for extraction tasks where the output schema is strictly defined (JSON Schema with fewer than 10 fields), the input context is under 8k tokens, and the data is not highly ambiguous (e.g., standard invoices rather than handwritten notes). Under those conditions mini achieves ~95% of 4o's F1 score at roughly 1/17th the input cost ($0.15 vs $2.50 per 1M input tokens) with identical latency.
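The criteria above can be turned into a simple routing rule. The following is a minimal sketch, not a tested policy: the thresholds and prices are the ones quoted in this report, while the function names (`input_cost`, `pick_model`) and the boolean `ambiguous` flag are hypothetical and would need to be set by your own pipeline.

```python
# Input prices quoted in this report, in USD per 1M input tokens.
PRICE_PER_M_INPUT = {"gpt-4o-mini": 0.15, "gpt-4o": 2.50}


def input_cost(model: str, input_tokens: int) -> float:
    """Input-side cost in USD for a single request (output tokens not included)."""
    return PRICE_PER_M_INPUT[model] * input_tokens / 1_000_000


def pick_model(num_fields: int, input_tokens: int, ambiguous: bool) -> str:
    """Hypothetical router: use mini only when all three report criteria hold."""
    if num_fields < 10 and input_tokens < 8_000 and not ambiguous:
        return "gpt-4o-mini"
    return "gpt-4o"


# Price ratio implied by the quoted figures: 2.50 / 0.15, about 16.7x.
ratio = PRICE_PER_M_INPUT["gpt-4o"] / PRICE_PER_M_INPUT["gpt-4o-mini"]
```

For example, `pick_model(6, 3_000, False)` selects mini, while a 20k-token input or an ambiguous document falls back to 4o. How "ambiguous" is determined is left open here; the report does not specify a method.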
Journey Context:
Teams often over-specify 4o for extraction out of fear of parsing errors, but mini's instruction following is robust on constrained tasks. Known failure modes for mini: nested reasoning (inferring implied fields), long-context extraction (>16k tokens), and adversarial inputs designed to confuse smaller models. Always validate JSON Schema adherence with a secondary check or constrained decoding.
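One way to implement the secondary check is to parse the model output and compare it against the expected fields before accepting it. The sketch below uses only the standard library and a hypothetical four-field invoice schema (`INVOICE_FIELDS`); the field names and the `check_extraction` helper are illustrative assumptions, and a full JSON Schema validator would be the more complete choice in production.

```python
import json

# Hypothetical invoice schema: field name -> accepted Python type(s).
INVOICE_FIELDS = {
    "invoice_number": str,
    "vendor": str,
    "total": (int, float),
    "currency": str,
}


def check_extraction(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the output passed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]

    problems = []
    for name, expected in INVOICE_FIELDS.items():
        if name not in data:
            problems.append(f"missing field: {name}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(data[name], bool) or not isinstance(data[name], expected):
            problems.append(f"wrong type for field: {name}")
    for name in data:
        if name not in INVOICE_FIELDS:
            problems.append(f"unexpected field: {name}")
    return problems
```

A non-empty result is a reasonable signal to retry the request or escalate that document to 4o, so the cheaper model handles the common case and the larger one only sees the failures.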
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:42:07.800596+00:00— report_created — created