Report #39960
[cost_intel] Claude 3.5 Haiku vs Sonnet quality cliff for multi-hop document extraction
Use Haiku for single-pass structured extraction (JSON from clean tables); upgrade to Sonnet when the task requires cross-page reasoning or synthesis of more than 3 discrete facts.
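The routing rule above can be sketched as a small decision function. This is a minimal illustration, not a tested production router: the `ExtractionTask` fields, the `choose_model` name, and the `"haiku"` / `"sonnet"` labels are all hypothetical, and how a pipeline would estimate page and fact counts before calling a model is left open.

```python
from dataclasses import dataclass

# Hypothetical labels; substitute the model identifiers your provider exposes.
HAIKU = "haiku"
SONNET = "sonnet"

# Threshold taken from the rule above: more than 3 discrete facts goes to Sonnet.
MAX_FACTS_FOR_HAIKU = 3


@dataclass
class ExtractionTask:
    pages_referenced: int  # distinct pages the answer depends on (assumed known upfront)
    facts_to_combine: int  # discrete facts that must be synthesized into one answer


def choose_model(task: ExtractionTask) -> str:
    """Return the cheaper model unless the task crosses pages or combines too many facts."""
    needs_cross_page = task.pages_referenced > 1
    needs_synthesis = task.facts_to_combine > MAX_FACTS_FOR_HAIKU
    return SONNET if (needs_cross_page or needs_synthesis) else HAIKU


# Single-page invoice number: stays on the cheaper model.
print(choose_model(ExtractionTask(pages_referenced=1, facts_to_combine=1)))
# Tax summed from three line items on different pages: routed to the stronger model.
print(choose_model(ExtractionTask(pages_referenced=3, facts_to_combine=3)))
```

Treating either condition as sufficient keeps the router conservative: a wrongly escalated task costs one Sonnet call, while a wrongly retained task risks a correction or retry.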
Journey Context:
Haiku is far cheaper than Sonnet and has roughly 5x lower latency, but it exhibits a steep accuracy cliff on spatial and multi-hop reasoning. On price, the $0.25 vs $3 per 1M input tokens comparison is a 12x gap and matches Claude 3 Haiku's list price; Claude 3.5 Haiku lists at $0.80 per 1M input tokens, a gap of about 3.75x. Anthropic's model guidance positions Haiku for 'fast, lightweight actions' and Sonnet for 'complex reasoning.' In production document pipelines, Haiku achieves >95% F1 on isolated field extraction (invoice numbers, dates from single pages) but drops to <70% accuracy when asked to 'calculate tax by summing three line items across different pages', a drop attributed to limited context window utilization and reasoning depth. On those multi-hop tasks, the cost of a Haiku failure (manual correction or retry loops) exceeds Sonnet's single-pass cost, so the per-token savings do not hold up. Single-pass extraction shows no such cliff, so Haiku is the better choice there on both cost and latency.
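The claim that a Haiku failure can cost more than a single Sonnet pass can be checked with a simple expected-cost model. The sketch below rests on stated assumptions: each document gets one model call, a wrong result incurs a fixed `failure_cost` (manual correction or a retry), and output-token charges are ignored. The function names, the 20,000-token document size, and the 0.95 Sonnet accuracy are hypothetical placeholders; only the $0.25 and $3 input prices and the 70% Haiku multi-hop ceiling come from this report.

```python
def expected_cost(call_cost: float, accuracy: float, failure_cost: float) -> float:
    """Expected cost per document: one call plus the expected cost of fixing a wrong result."""
    return call_cost + (1.0 - accuracy) * failure_cost


def break_even_failure_cost(cheap_call: float, cheap_acc: float,
                            strong_call: float, strong_acc: float) -> float:
    """Failure cost above which the stronger model is cheaper in expectation."""
    if strong_acc <= cheap_acc:
        raise ValueError("stronger model must be more accurate for a break-even to exist")
    return (strong_call - cheap_call) / (strong_acc - cheap_acc)


# Input prices per 1M tokens as quoted in this report.
HAIKU_PRICE, SONNET_PRICE = 0.25, 3.00
DOC_TOKENS = 20_000  # hypothetical document size

haiku_call = DOC_TOKENS / 1_000_000 * HAIKU_PRICE
sonnet_call = DOC_TOKENS / 1_000_000 * SONNET_PRICE

# 0.70 is the report's multi-hop ceiling for Haiku; 0.95 for Sonnet is an assumption.
threshold = break_even_failure_cost(haiku_call, 0.70, sonnet_call, 0.95)
print(f"Sonnet is cheaper once a failure costs more than ${threshold:.2f}")
```

Under these placeholder numbers the break-even failure cost is a fraction of a dollar, so any manual correction step would tip multi-hop work toward Sonnet. For isolated-field extraction, where both models are accurate, the accuracy gap shrinks and the break-even rises sharply, which favors Haiku.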
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:32:41.402057+00:00— report_created — created