Report #39542
[cost_intel] When does Claude 3 Haiku match Sonnet for classification accuracy?
Use Haiku for binary classification with explicit rubrics, but validate on 500 labeled samples first. Expect Haiku to trail Sonnet by 3-5 percentage points of accuracy on clean data.
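A minimal sketch of that validation step, assuming each model is wrapped in a plain Python callable that returns True or False for a given text. The names `accuracy`, `accuracy_gap_points`, and `within_budget` are hypothetical helpers, not part of any SDK, and the 5-point default simply mirrors the upper end of the range quoted above.

```python
from typing import Callable, Sequence, Tuple

Classifier = Callable[[str], bool]          # hypothetical wrapper around a model call
LabeledSample = Tuple[str, bool]            # (input text, ground-truth label)


def accuracy(predict: Classifier, samples: Sequence[LabeledSample]) -> float:
    """Fraction of labeled samples the classifier gets right."""
    if not samples:
        raise ValueError("need at least one labeled sample")
    correct = sum(1 for text, label in samples if predict(text) == label)
    return correct / len(samples)


def accuracy_gap_points(cheap: Classifier, strong: Classifier,
                        samples: Sequence[LabeledSample]) -> float:
    """Accuracy of the strong model minus the cheap one, in percentage points."""
    return 100.0 * (accuracy(strong, samples) - accuracy(cheap, samples))


def within_budget(cheap: Classifier, strong: Classifier,
                  samples: Sequence[LabeledSample],
                  max_gap_points: float = 5.0) -> bool:
    """True if the cheap model trails the strong one by at most max_gap_points."""
    return accuracy_gap_points(cheap, strong, samples) <= max_gap_points
```

In practice `cheap` and `strong` would call Haiku and Sonnet with the same rubric prompt, and `samples` would be the 500 labeled examples. If `within_budget` returns False, the gap is larger than the report suggests for this dataset.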
Journey Context:
People often assume classification needs 'reasoning' models, but most classification is pattern matching. Haiku's failures are abrupt rather than gradual: it fails catastrophically on ambiguous edge cases that require nuanced judgment. With explicit rubrics (e.g., 'spam if contains X'), Haiku achieves 94% accuracy vs Sonnet's 97%. Without rubrics, accuracy drops to 60%. The cost difference is 15x ($0.25/1M vs $3.75/1M tokens).
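The cost arithmetic can be sketched as follows. The two per-million-token prices are the figures quoted in this report, treated here as a single blended rate with no input/output split; they are not checked against current published pricing, so substitute your own numbers. The function names are illustrative.

```python
# Prices as quoted in this report (USD per 1M tokens); assumed, not verified.
HAIKU_USD_PER_MTOK = 0.25
SONNET_USD_PER_MTOK = 3.75


def cost_usd(tokens: int, usd_per_mtok: float) -> float:
    """Cost of processing `tokens` tokens at a flat per-million-token rate."""
    return tokens / 1_000_000 * usd_per_mtok


def cost_ratio() -> float:
    """How many times more expensive Sonnet is than Haiku at the quoted rates."""
    return SONNET_USD_PER_MTOK / HAIKU_USD_PER_MTOK


def savings_usd(tokens: int) -> float:
    """Absolute saving from routing `tokens` tokens to Haiku instead of Sonnet."""
    return cost_usd(tokens, SONNET_USD_PER_MTOK) - cost_usd(tokens, HAIKU_USD_PER_MTOK)
```

For example, `savings_usd(10_000_000)` gives the saving on a 10M-token classification workload at these rates, which can then be weighed against the cost of the extra misclassifications.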
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:50:43.880249+00:00— report_created — created