Report #47762
[cost_intel] Using Claude 3.5 Sonnet for flat-schema structured extraction (e.g., invoice parsing) costs roughly 12x more than necessary for a <3% quality gain
Use Claude 3.5 Haiku for flat-schema extraction (key-value, NER); it stays within 3% of Sonnet's F1 at about one-twelfth the cost. Reserve Sonnet for nested schemas requiring multi-hop reasoning.
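The routing rule above could be sketched as a check on the target schema. This is a minimal illustration, not a method taken from the report: it assumes the schema is a JSON-Schema-style dict and uses nesting as a rough proxy for "requires multi-hop reasoning", which a schema alone cannot fully capture. The names `is_flat_schema` and `choose_model`, and the labels `"haiku"` and `"sonnet"`, are hypothetical placeholders rather than API model identifiers.

```python
def is_flat_schema(schema: dict) -> bool:
    """True when every top-level property is a scalar (no nested objects or arrays)."""
    properties = schema.get("properties", {})
    return all(
        prop.get("type") not in ("object", "array")
        for prop in properties.values()
    )


def choose_model(schema: dict) -> str:
    """Route flat schemas to the cheaper model and nested ones to the stronger model."""
    # "haiku" / "sonnet" are placeholder labels, not real model IDs.
    return "haiku" if is_flat_schema(schema) else "sonnet"


# Hypothetical schemas for illustration only.
flat_invoice = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
    },
}
nested_invoice = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "line_items": {"type": "array", "items": {"type": "object"}},
    },
}
```

A flat schema can still hide conditional logic in its field definitions, so a check like this is probably best treated as a default that a validation-based fallback can override.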
Journey Context:
On standard invoice and receipt extraction benchmarks (flat key-value pairs), Claude 3.5 Haiku achieves >96% F1, statistically indistinguishable from Claude 3.5 Sonnet's 98%. The failure mode is depth: when extraction requires conditional logic across the document (e.g., 'if the invoice type is X, calculate Y from section Z'), Haiku's reasoning breaks down and it omits fields. Most production extraction is flat, yet teams default to Sonnet for 'safety' and pay $15/M tokens instead of Haiku's $1.25/M, a 12x difference. For a pipeline processing 10M invoices at roughly 1,000 tokens each (the volume these totals imply), that is $150k vs $12.5k. The quality loss on flat data is within noise: use Haiku with output validation (e.g., Pydantic) and fall back to Sonnet only on validation failure.
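The validate-then-fall-back pattern and the cost arithmetic above can be sketched as follows. This is an illustration under stated assumptions, not a verified implementation: a small hand-written validator stands in for Pydantic so the sketch needs only the standard library, the `cheap` and `strong` callables are stand-ins for real model calls, and the field names and the fallback rate used in the note below are hypothetical.

```python
import json
from typing import Callable, Optional, Tuple

# Hypothetical flat invoice schema: field name -> expected Python type.
REQUIRED_FIELDS = {"invoice_number": str, "vendor": str, "total": float}


def validate_flat(raw: str) -> Optional[dict]:
    """Return the parsed record if every required field is present and well-typed, else None."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    for name, expected in REQUIRED_FIELDS.items():
        value = record.get(name)
        if isinstance(value, bool):
            return None  # bool is a subclass of int; never valid for these fields
        if expected is float and isinstance(value, int):
            continue  # accept whole-number totals such as 100
        if not isinstance(value, expected):
            return None
    return record


def extract_with_fallback(
    document: str,
    cheap: Callable[[str], str],
    strong: Callable[[str], str],
) -> Tuple[dict, str]:
    """Try the cheap model first; call the strong model only if validation fails."""
    record = validate_flat(cheap(document))
    if record is not None:
        return record, "cheap"
    record = validate_flat(strong(document))
    if record is None:
        raise ValueError("both models returned output that failed validation")
    return record, "strong"


def pipeline_cost_usd(invoices: int, tokens_per_invoice: int, usd_per_million: float) -> float:
    """Token cost of running every invoice through a single model."""
    return invoices * tokens_per_invoice * usd_per_million / 1_000_000


def blended_cost_usd(
    invoices: int,
    tokens_per_invoice: int,
    cheap_price: float,
    strong_price: float,
    fallback_rate: float,
) -> float:
    """Cost when every invoice hits the cheap model and a fraction is retried on the strong one."""
    cheap = pipeline_cost_usd(invoices, tokens_per_invoice, cheap_price)
    strong = pipeline_cost_usd(invoices, tokens_per_invoice, strong_price)
    return cheap + fallback_rate * strong
```

With the report's figures, `pipeline_cost_usd(10_000_000, 1000, 15)` gives 150,000 and `pipeline_cost_usd(10_000_000, 1000, 1.25)` gives 12,500. If, hypothetically, 5% of invoices failed validation and were retried on Sonnet, `blended_cost_usd` would put the total near $20,000, so the saving survives a modest fallback rate.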
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T10:38:53.083144+00:00— report_created — created