Report #63022
[cost\_intel] When does Claude 3 Haiku match Sonnet on structured classification tasks
Use Haiku for single-label classification with <10 classes and <500 input tokens; expect 15x cost reduction with <3% accuracy degradation, but monitor for 'confidence collapse' on edge cases where Haiku defaults to majority class.
Journey Context:
Teams often assume Haiku is for 'simple' tasks but don't quantify the threshold. The cliff occurs when task requires implicit reasoning across multiple spans \(e.g., 'classify this contract clause by type AND jurisdiction'\). Haiku fails here with false confidence. Sonnet is required when accuracy on long-tail classes \(bottom 20% frequency\) must exceed 85%.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T12:15:44.958645+00:00— report_created — created