Report #80217
[cost_intel] When does Claude 3 Haiku match Opus accuracy on document classification?
Use Haiku for single-label classification with <4k context and clear label definitions; expect <3% accuracy drop vs Opus on English text. Switch to Sonnet/Opus for multi-label, hierarchical categories, or fuzzy boundaries.
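The rule of thumb above can be sketched as a small routing function. This is only an illustration: the function name `pick_model`, its parameters, and the returned strings are hypothetical, and reading "<4k context" as 4,000 tokens is an assumption.

```python
# Hypothetical routing helper; names and return values are illustrative only.
MAX_HAIKU_CONTEXT_TOKENS = 4_000  # assumption: "<4k context" means tokens


def pick_model(context_tokens: int,
               multi_label: bool,
               hierarchical: bool,
               fuzzy_boundaries: bool,
               clear_label_definitions: bool = True) -> str:
    """Return "haiku" for the easy case described above, else "sonnet-or-opus"."""
    easy_case = (
        context_tokens < MAX_HAIKU_CONTEXT_TOKENS
        and clear_label_definitions
        and not multi_label
        and not hierarchical
        and not fuzzy_boundaries
    )
    return "haiku" if easy_case else "sonnet-or-opus"


# Single-label task, short document, clear labels.
choice_simple = pick_model(2_000, multi_label=False,
                           hierarchical=False, fuzzy_boundaries=False)

# Same document, but several labels may apply at once.
choice_multi = pick_model(2_000, multi_label=True,
                          hierarchical=False, fuzzy_boundaries=False)
```

Treat the output as a starting point for testing, not a guarantee; the thresholds come from this report, not from a benchmark.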
Journey Context:
A common assumption is that smaller models fail on all classification tasks. In practice, Haiku matches Opus when the task is "pick one label from a known set" and the prompt gives enough context. It falls short on multi-label extraction, zero-shot classification (no examples in the prompt), and cases that need external knowledge to disambiguate. Cost difference: Haiku is about 60x cheaper per token than Opus ($0.25 vs $15 per MTok of input). Test on 100 samples that reflect your own label distribution before scaling; accuracy varies widely with class imbalance.
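A minimal sketch of the two checks mentioned above: the cost arithmetic and a per-class accuracy comparison on a labeled sample. The prices are the ones quoted in this report; the function names and the sample labels ("invoice", "contract") are hypothetical, and the predictions are made-up placeholders rather than real model output.

```python
from collections import defaultdict

# USD per million input tokens, as quoted in this report.
PRICE_PER_MTOK = {"haiku": 0.25, "opus": 15.00}


def input_cost_usd(model: str, tokens: int) -> float:
    """Input-token cost in USD for `tokens` tokens on `model`."""
    return PRICE_PER_MTOK[model] * tokens / 1_000_000


def per_class_accuracy(gold, predicted):
    """Return {label: accuracy} so rare classes are not hidden by frequent ones."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for g, p in zip(gold, predicted):
        total[g] += 1
        if g == p:
            correct[g] += 1
    return {label: correct[label] / total[label] for label in total}


def overall_accuracy(gold, predicted) -> float:
    """Fraction of samples where the prediction equals the gold label."""
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Placeholder 100-sample set with a 90/10 class imbalance (not real data).
gold = ["invoice"] * 90 + ["contract"] * 10
small_preds = ["invoice"] * 100  # a model that always predicts the majority class
large_preds = list(gold)         # a model that gets every sample right

cost_ratio = input_cost_usd("opus", 1_000_000) / input_cost_usd("haiku", 1_000_000)
small_overall = overall_accuracy(gold, small_preds)
small_by_class = per_class_accuracy(gold, small_preds)
```

In this placeholder data the smaller model looks fine overall while missing every minority-class document, which is why the per-class breakdown is worth checking before relying on a single accuracy number.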
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:14:46.892714+00:00— report_created — created