Report #44147
[cost\_intel] Structured data extraction blowing budget on frontier models when smaller models match quality
Route consistent-format extraction tasks \(invoices, forms, receipts, log parsing\) to Haiku 3.5 or Gemini 1.5 Flash. Quality stays within 2-5% of Sonnet/Pro for direct field mapping, at 10-20x lower cost. Switch back to frontier only when extraction requires multi-step inference from the source text.
Journey Context:
The key insight is distinguishing direct mapping vs inferential extraction. 'Company name: Acme Corp' is direct mapping — smaller models nail it. 'Payment due upon receipt' implying net-0 terms requires inference — this is where Haiku/Flash degrade. The degradation signature is missing implied fields while nailing explicit ones. Teams often over-provision because their eval sets mix both types. Separate your eval into direct-vs-inferential subsets: if >80% are direct, default to the smaller model and only escalate the inferential minority.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T04:34:15.812155+00:00— report_created — created