Report #75782
[cost_intel] Using Claude 3.5 Sonnet for high-volume binary/ternary classification (sentiment, routing) incurs a 15-20x cost overhead for a <3% accuracy gain
Deploy Claude 3.5 Haiku for classification tasks with fewer than 10 classes and clear decision boundaries. Haiku reaches 95-97% of Sonnet's scores on MMLU (accuracy) and on standard classification benchmarks (F1) at $0.80 per million input tokens versus Sonnet's $15 per million, which makes it roughly 18-19x cheaper. Implement confidence-threshold routing: accept Haiku's label when its confidence is above 0.9, and escalate the remaining, uncertain cases to Sonnet.
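The routing rule above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the `Classifier` signature, the `route` function, and the two stub classifiers are hypothetical names introduced here, and the sketch assumes each model call can be wrapped so that it returns a label together with a confidence score in [0, 1]. How that score is obtained is left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical wrapper signature: text in, (label, confidence in [0, 1]) out.
Classifier = Callable[[str], Tuple[str, float]]


@dataclass
class RoutedResult:
    label: str
    confidence: float
    model: str  # which tier produced the final label


def route(text: str, cheap: Classifier, strong: Classifier,
          threshold: float = 0.9) -> RoutedResult:
    """Accept the cheap model's label only when it is confident enough."""
    label, conf = cheap(text)
    if conf > threshold:
        return RoutedResult(label, conf, "haiku")
    # Uncertain case: escalate to the stronger (more expensive) model.
    label, conf = strong(text)
    return RoutedResult(label, conf, "sonnet")


# Stand-in classifiers so the sketch runs without any API access.
def fake_haiku(text: str) -> Tuple[str, float]:
    return ("positive", 0.97) if "love" in text else ("neutral", 0.55)


def fake_sonnet(text: str) -> Tuple[str, float]:
    return ("negative", 0.88)


confident = route("I love it", fake_haiku, fake_sonnet)
escalated = route("it arrived, I guess", fake_haiku, fake_sonnet)
```

In this sketch `confident.model` is `"haiku"` and `escalated.model` is `"sonnet"`. The 0.9 threshold comes from the report; it is worth tuning against a labeled sample, since the share of escalated calls drives the blended cost.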
Journey Context:
The intuition that "bigger model = better classification" holds for ambiguous classes learned from a few examples, but on discriminative tasks with clean training distributions smaller models already reach asymptotic performance. Haiku falls short on nuanced multi-label classification (>5 labels) and on highly imbalanced datasets, where Sonnet's few-shot in-context learning provides lift. The cost gap is stark: at the prices above and roughly 1,000 input tokens per call, 1M daily classification calls cost about $15k per day with Sonnet and $800 with Haiku. Monitor calibration (via logprobs where the API exposes them, or another confidence signal otherwise); Haiku is often over-confident on out-of-distribution inputs.
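The two quantities this paragraph asks you to watch, daily spend and calibration, can be estimated with a short sketch like the one below. Several things here are assumptions rather than statements from the report: the figure of 1,000 input tokens per call is inferred from the $15k total, output tokens are ignored, the per-million prices are the ones quoted in this report and should be checked against the current price list, and expected calibration error (ECE) is one common calibration metric chosen for illustration. All function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def daily_cost_usd(calls_per_day: int, tokens_per_call: int,
                   price_per_mtok: float) -> float:
    """Input-token cost per day; output tokens are ignored in this sketch."""
    return calls_per_day * tokens_per_call / 1_000_000 * price_per_mtok


def blended_cost_usd(calls_per_day: int, tokens_per_call: int,
                     cheap_price: float, strong_price: float,
                     escalation_rate: float) -> float:
    """Every call hits the cheap model; a fraction is escalated as well."""
    cheap = daily_cost_usd(calls_per_day, tokens_per_call, cheap_price)
    strong = daily_cost_usd(calls_per_day, tokens_per_call, strong_price)
    return cheap + escalation_rate * strong


def expected_calibration_error(confidences: Sequence[float],
                               correct: Sequence[bool],
                               n_bins: int = 10) -> float:
    """Size-weighted average gap between confidence and accuracy per bin."""
    if len(confidences) != len(correct) or len(confidences) == 0:
        raise ValueError("need equal-length, non-empty inputs")
    bins: List[List[Tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 goes in last bin
        bins[idx].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for bucket in bins:
        if not bucket:
            continue
        avg_conf = sum(c for c, _ in bucket) / len(bucket)
        accuracy = sum(1 for _, ok in bucket if ok) / len(bucket)
        ece += len(bucket) / total * abs(avg_conf - accuracy)
    return ece


# Prices as quoted in this report (USD per million input tokens).
sonnet_only = daily_cost_usd(1_000_000, 1_000, 15.0)
haiku_only = daily_cost_usd(1_000_000, 1_000, 0.80)
# Assumed 10% escalation rate, purely for illustration.
routed = blended_cost_usd(1_000_000, 1_000, 0.80, 15.0, 0.10)
```

Under these assumptions the Sonnet-only and Haiku-only figures come to $15,000 and $800 per day, and a 10% escalation rate puts the routed setup at about $2,300. A rising ECE on recent traffic, computed from a labeled sample, is one signal that Haiku's confidence is drifting on out-of-distribution inputs and that the routing threshold needs revisiting.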
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:47:41.723315+00:00— report_created — created