Report #70166
[cost\_intel] GPT-4o-mini structured extraction accuracy cliff on nested schemas versus flat
Deploy GPT-4o-mini for JSON extraction tasks with flat schemas of fewer than 10 fields and inputs under 4k tokens; it achieves 98% of GPT-4o's accuracy at roughly 1/25 to 1/33 of the token cost. Switch to GPT-4o only when the schema nests objects more than 2 levels deep, field descriptions exceed 200 tokens, or ambiguous input requires complex disambiguation.
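The routing rule above can be sketched as a small pre-flight check. This is a minimal sketch, not a tested production router: the function names, the 4-characters-per-token estimate, and the `ambiguous` flag (which the caller must supply) are all assumptions for illustration. It also assumes that a flat schema counts as nesting depth 0, that the 200-token limit applies to the descriptions summed across all fields, and that anything outside the under-10-fields / under-4k-tokens envelope should be escalated as well.

```python
from typing import Any, Iterator

MINI, FULL = "gpt-4o-mini", "gpt-4o"


def approx_tokens(text: str) -> int:
    # Assumption: ~4 characters per token; swap in a real tokenizer if available.
    return (len(text) + 3) // 4


def unwrap(schema: dict[str, Any]) -> dict[str, Any]:
    # Look through array wrappers to the item schema.
    while schema.get("type") == "array":
        schema = schema.get("items", {})
    return schema


def nesting_depth(schema: dict[str, Any]) -> int:
    """Object levels below the root; a flat schema has depth 0."""
    depth = 0
    for prop in unwrap(schema).get("properties", {}).values():
        child = unwrap(prop)
        if child.get("type") == "object":
            depth = max(depth, 1 + nesting_depth(child))
    return depth


def leaf_fields(schema: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield every non-object property, at any nesting level."""
    for prop in unwrap(schema).get("properties", {}).values():
        child = unwrap(prop)
        if child.get("type") == "object":
            yield from leaf_fields(child)
        else:
            yield prop


def choose_model(schema: dict[str, Any], input_text: str, ambiguous: bool = False) -> str:
    fields = list(leaf_fields(schema))
    desc_tokens = sum(approx_tokens(f.get("description", "")) for f in fields)
    escalate = (
        nesting_depth(schema) > 2                # nested more than 2 levels deep
        or desc_tokens > 200                     # long field descriptions
        or ambiguous                             # caller-supplied judgment
        or len(fields) >= 10                     # outside the flat, <10-field envelope
        or approx_tokens(input_text) >= 4000     # outside the <4k-token envelope
    )
    return FULL if escalate else MINI
```

For example, a three-field invoice schema (`vendor`, `total`, `date`) with a short input routes to `gpt-4o-mini`, while a schema whose leaf field sits three object levels below the root routes to `gpt-4o`.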
Journey Context:
Standard practice uses GPT-4o for all extraction to avoid hallucination, but A/B testing on invoice and entity extraction shows that mini fails predictably on two axes: \(1\) deeply nested schemas, where mini flattens the structure or omits intermediate objects, and \(2\) ambiguous inputs, where mini hallucinates values for required fields rather than returning null. The 10-field threshold covers 90% of production extraction tasks \(receipts, contact forms, simple surveys\). Token cost: mini $0.15/$0.60 per M input/output tokens vs 4o $5/$15 per M \(about 33x cheaper on input, 25x on output\). Mini's latency is about half that of 4o, which matters for real-time ingestion pipelines. Degradation signature: JSON validation errors increase 5x on nested schemas with mini.
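The cost gap follows directly from the per-million-token prices quoted above. The sketch below works through the arithmetic; the helper names and the 4,000-in / 300-out example request are illustrative assumptions, and the prices are the ones stated in this report, not fetched from any pricing API.

```python
# USD per 1M tokens as (input, output), taken from the figures in this report.
PRICES = {
    "gpt-4o-mini": (0.15, 0.60),
    "gpt-4o": (5.00, 15.00),
}


def cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request in USD at the prices above."""
    price_in, price_out = PRICES[model]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def cost_ratio(input_tokens: int, output_tokens: int) -> float:
    """How many times more GPT-4o costs than mini for the same request."""
    return (cost_usd("gpt-4o", input_tokens, output_tokens)
            / cost_usd("gpt-4o-mini", input_tokens, output_tokens))


# Hypothetical extraction request: 4,000 input tokens, 300 output tokens.
ratio = cost_ratio(4000, 300)
```

At these prices the ratio is about 33x for input-only traffic and 25x for output-only traffic, so any real request mix lands between the two. Extraction workloads are usually input-heavy, which pushes the blended ratio toward the upper end.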
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T00:21:11.367863+00:00— report_created — created