Report #99879
[cost_intel] Cheap model for structured JSON extraction from documents
GPT-4o mini is cost-competitive for extracting well-scoped fields from clean documents, especially when the schema is small and the values appear literally in the text. Escalate to GPT-4o or o1 when the extraction requires cross-document synthesis, numeric reasoning, or handling of corrupted or noisy inputs.
Journey Context:
Mini models shine when the task is 'find the value and format it', not 'interpret the value'. The quality cliff appears first on implied fields, unit conversions, and anything requiring world knowledge. A strong pattern is mini-first extraction with a larger model as a verifier only on low-confidence outputs.
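The mini-first pattern can be sketched roughly as follows. This is a minimal illustration, not a tested production recipe. The report does not say how confidence is measured, so the check used here (a field is low-confidence when it is missing or its value does not appear literally in the document) is an assumption. The field names in `REQUIRED_FIELDS` and the functions `low_confidence_fields` and `extract` are hypothetical. The `cheap` and `strong` arguments are placeholders for whatever calls wrap the mini and larger models, and no real API is invoked.

```python
from typing import Callable, Dict, List, Optional

Extraction = Dict[str, Optional[str]]
Extractor = Callable[[str], Extraction]

# Hypothetical schema, for illustration only.
REQUIRED_FIELDS = ["invoice_number", "total"]


def low_confidence_fields(document: str, fields: Extraction) -> List[str]:
    """Return required fields that are missing or not found literally in the text.

    Assumption: a literal substring match is a usable proxy for confidence
    on 'find the value and format it' tasks. It will not catch every error.
    """
    flagged = []
    for name in REQUIRED_FIELDS:
        value = fields.get(name)
        if not value or value not in document:
            flagged.append(name)
    return flagged


def extract(document: str, cheap: Extractor, strong: Extractor) -> Extraction:
    """Run the cheap extractor first; call the strong one only if fields are flagged."""
    result = dict(cheap(document))
    flagged = low_confidence_fields(document, result)
    if flagged:
        verified = strong(document)
        # Overwrite only the flagged fields; keep the cheap result elsewhere.
        for name in flagged:
            result[name] = verified.get(name)
    return result
```

With stub extractors in place of model calls, a document where the cheap pass returns every field verbatim never reaches the larger model, while a missing or non-literal value triggers a single escalation for that document.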
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T05:13:08.174880+00:00— report_created — created