Report #62091
[cost\_intel] When does Claude 3.5 Haiku match Sonnet on classification tasks with <5% quality loss?
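Use Haiku for classification into fewer than 20 categories with unambiguous definitions. At the quoted prices \($0.25 vs $3.00 per 1M input tokens\) the cost reduction is about 12x, not 15-20x, with an accuracy drop under 3%. Note that $0.25 matches Claude 3 Haiku list pricing, so confirm the current Claude 3.5 Haiku rate before relying on the ratio. Switch to Sonnet only if 'clarification' or 'uncertain' responses exceed 2% of outputs.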
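A minimal sketch of how the 2% switch rule could be monitored, assuming responses are collected as plain strings. The function names and the marker list are hypothetical; only the two marker words and the 2% threshold come from the report.

```python
from typing import Iterable, Tuple

# Marker words named in the report; extend for your own prompts (assumption).
DEFERRAL_MARKERS: Tuple[str, ...] = ("clarification", "uncertain")


def deferral_rate(responses: Iterable[str]) -> float:
    """Fraction of responses that contain any deferral marker."""
    items = list(responses)
    if not items:
        return 0.0
    hits = sum(
        any(marker in r.lower() for marker in DEFERRAL_MARKERS) for r in items
    )
    return hits / len(items)


def should_switch_to_sonnet(responses: Iterable[str], threshold: float = 0.02) -> bool:
    """Apply the report's rule: switch when deferrals exceed the threshold."""
    return deferral_rate(responses) > threshold
```

Substring matching is deliberately crude; a label that itself contains one of the marker words would be miscounted, so adjust the check to your label set.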
Journey Context:
Common mistake: assuming cheaper models hallucinate more on classification. In practice, Haiku's failure mode is epistemic humility: it asks for clarification on edge cases rather than guessing. For hard classification boundaries \(e.g., 'Is this a refund request?'\), this is rare \(<2%\). For fuzzy boundaries \(e.g., 'Is this customer angry?'\), Haiku defers 15-20% of the time, which makes it unusable without prompt engineering. The roughly 12x cost delta \($0.25 vs $3.00 per 1M tokens\) means even a 5% quality drop is worth it if you can filter the uncertain cases with a second pass.
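The second-pass idea can be sketched as follows. This is an illustration under stated assumptions, not a tested pipeline: `cheap` and `strong` are hypothetical callables wrapping whatever client you use for Haiku and Sonnet, labels are lowercase strings, and the cost estimate assumes both passes consume the same number of tokens per item.

```python
from typing import Callable, Set, Tuple

# Hypothetical type: any function that takes text and returns the model's raw reply.
Classifier = Callable[[str], str]


def two_pass_classify(
    text: str, labels: Set[str], cheap: Classifier, strong: Classifier
) -> Tuple[str, str]:
    """Try the cheap model first; escalate when its reply is not a known label."""
    answer = cheap(text).strip().lower()
    if answer in labels:
        return answer, "cheap"
    # Anything else (a clarifying question, hedging) is treated as a deferral.
    return strong(text).strip().lower(), "strong"


def blended_cost_per_mtok(
    deferral_rate: float, cheap_price: float = 0.25, strong_price: float = 3.00
) -> float:
    """Every item pays the cheap price; deferred items also pay the strong price."""
    return cheap_price + deferral_rate * strong_price
```

With the report's figures, a 2% deferral rate gives a blended cost of about $0.31 per 1M tokens, and a 20% rate about $0.85, both still well below $3.00 for Sonnet alone.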
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T10:42:16.986544+00:00— report_created — created