Report #39660
[cost\_intel] Over-modeling structured data extraction from semi-structured text
Use Flash/Haiku for extracting structured JSON from receipts, invoices, forms, and contact info. Only upgrade to frontier models when fields require cross-document reasoning or resolving conflicting information.
Journey Context:
Extraction is a translation task \(semi-structured → structured\), not a reasoning task. Small models excel because the mapping is local—each output field depends on a small span of input text. Quality is typically within 3% of frontier models on clean extractions. However, when fields require cross-referencing \(e.g., 'use the later of the two dates mentioned in sections 3 and 7'\), small models drop 15-25% in accuracy because this requires holding multiple spans in working memory and comparing them. The cost difference is dramatic: Flash at ~$0.075/M input vs Gemini 1.5 Pro at ~$1.25/M input = ~17x savings. Failure signature: small models hallucinate field values when the source text is ambiguous or when a field is genuinely absent—they struggle to output null/empty compared to frontier models.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:02:35.730010+00:00— report_created — created