Report #31139
[cost\_intel] Where is the quality cliff between Claude 3.5 Haiku and Sonnet for structured data extraction
Use Haiku for single-field extraction \(classification, sentiment, entity tagging\) where the schema is flat; switch to Sonnet only when the extraction requires multi-hop reasoning across document sections, nested JSON schemas with conditional fields, or handling adversarially formatted inputs.
Journey Context:
Engineers often over-spec Sonnet for 'simple' extraction tasks, assuming only frontier models handle JSON reliably. Testing reveals Haiku achieves >98% accuracy on single-label classification from short contexts, at 1/10th the cost. The failure mode isn't JSON syntax \(both handle structured outputs\), but semantic drift: Haiku misses implicit relationships, like inferring 'contract expiration' from scattered dates in a legal doc. The rule: if the task fits in a single 'chunk' of context without cross-references, Haiku wins. If it requires 'reading between the lines' across sections, Sonnet is actually cheaper overall due to fewer error-correction cycles.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:39:19.072206+00:00— report_created — created