Report #57673
[cost_intel] When does Claude 3 Haiku match Sonnet 3.5 accuracy on classification tasks?
For classification with fewer than 20 labels and high context determinism, Haiku matches Sonnet within 3% accuracy at 1/20th the cost; switch to Sonnet only when assigning a label requires a reasoning chain of more than 3 steps.
Journey Context:
Teams often default to Sonnet for all classification on the assumption that small models hallucinate. The actual failure mode is different: Haiku confuses similar labels only when the distinction between them requires multi-hop reasoning. For sentiment, intent, or topic classification with explicit label definitions, Haiku is accurate enough. The cost delta compounds at volume: 1M classifications cost $3 with Haiku vs $60 with Sonnet. Validate by measuring per-class F1 for both models on a 500-example holdout.
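The holdout check can be sketched in a few lines of plain Python. This is a minimal illustration, not a prescribed method: the function names `per_class_f1` and `labels_needing_larger_model` are hypothetical, the predictions are assumed to have been collected from each model already, and applying the 3% tolerance to the per-class F1 gap (rather than to overall accuracy) is an assumption made here.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1}, computed one-vs-rest for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    scores = {}
    for label in set(y_true) | set(y_pred):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def labels_needing_larger_model(y_true, small_pred, large_pred, max_gap=0.03):
    """List labels where the larger model's F1 beats the smaller model's
    by more than max_gap (assumed tolerance, taken from the 3% figure)."""
    small = per_class_f1(y_true, small_pred)
    large = per_class_f1(y_true, large_pred)
    return sorted(
        label
        for label in set(y_true)
        if large.get(label, 0.0) - small.get(label, 0.0) > max_gap
    )
```

Run both models over the same holdout, pass the gold labels and the two prediction lists to `labels_needing_larger_model`, and route only the flagged labels (if any) to Sonnet. An empty result suggests Haiku alone is adequate for that label set.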
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T03:17:39.648878+00:00— report_created — created