Report #35052
[cost_intel] Claude 3 Haiku vs Sonnet accuracy cliff for text classification tasks
Deploy Haiku for binary or multiclass classification with structured inputs under 1k tokens. Expect a <5% accuracy drop versus Sonnet on pattern-matching tasks, but upgrade to Sonnet immediately for few-shot classification that requires implicit world knowledge or sarcasm detection.
Journey Context:
Sonnet's reasoning capabilities are wasted on simple pattern matching, while Haiku fails on classes requiring nuanced disambiguation (e.g., medical triage categories). Haiku is 12x cheaper ($0.25 vs $3.00 per million input tokens). The failure signature is a spike in false positives on out-of-distribution examples. Validate by running a confusion matrix on 200 labeled edge cases; if Haiku's F1 score is within 0.03 of Sonnet's, deploy Haiku.
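The validation step can be sketched roughly as follows. This is a minimal illustration, not a tested harness: the helper names (`macro_f1`, `confusion_matrix`, `choose_model`) are hypothetical, and it assumes macro-averaged F1, since the report does not say which F1 variant to use. It also assumes predictions from both models have already been collected for the same labeled edge cases.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Count (true_label, predicted_label) pairs."""
    return Counter(zip(y_true, y_pred))


def macro_f1(y_true, y_pred):
    """Macro-averaged F1 over every label seen in y_true or y_pred.

    Assumption: macro averaging; swap in micro or weighted F1 if that
    matches your class balance better.
    """
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    labels = set(y_true) | set(y_pred)
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


def choose_model(y_true, haiku_pred, sonnet_pred, max_gap=0.03):
    """Return "haiku" if its F1 trails Sonnet's by at most max_gap."""
    gap = macro_f1(y_true, sonnet_pred) - macro_f1(y_true, haiku_pred)
    return "haiku" if gap <= max_gap else "sonnet"
```

Inspecting `confusion_matrix(y_true, haiku_pred)` for off-diagonal pairs concentrated in one predicted class is one way to spot the false-positive spike described above.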
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T13:18:48.023823+00:00— report_created — created