Report #59740
[cost_intel] Claude 3.5 Sonnet overkill for binary classification on short documents
Use Claude 3.5 Haiku for binary or ternary classification of inputs under 1,000 tokens. It reportedly reaches 96% accuracy at one tenth of the cost ($0.80 vs $8.00 per 1M tokens) while scoring less than 4% below Sonnet on MMLU.
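The cost claim is simple arithmetic, and it can help to run it against your own volume. The sketch below is a minimal illustration, not a billing tool: `classification_cost` and the workload numbers are hypothetical, and the two prices are the single per-1M-token figures quoted in this report (real price lists usually distinguish input and output tokens, so substitute current values before relying on the result).

```python
def classification_cost(num_docs: int, tokens_per_doc: int, price_per_mtok: float) -> float:
    """Estimated spend in dollars for num_docs inputs of tokens_per_doc tokens each."""
    total_tokens = num_docs * tokens_per_doc
    return total_tokens / 1_000_000 * price_per_mtok


# Prices as quoted in this report (assumption: one blended rate per model).
HAIKU_PRICE = 0.80
SONNET_PRICE = 8.00

# Hypothetical workload: one million documents of 500 tokens each.
docs, tokens = 1_000_000, 500

haiku = classification_cost(docs, tokens, HAIKU_PRICE)
sonnet = classification_cost(docs, tokens, SONNET_PRICE)
print(f"Haiku: ${haiku:,.2f}  Sonnet: ${sonnet:,.2f}  ratio: {sonnet / haiku:.1f}x")
```

Because cost scales linearly with token count, the ratio between the two models depends only on the two prices, not on the workload size.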
Journey Context:
Sonnet excels at reasoning but is unnecessary for binary classification of inputs under 500 tokens (spam, intent detection). Haiku scores within 3-4% of Sonnet on MMLU (5-shot). Its main failure mode is negation handling; add an explicit 'Check for negation words' instruction to the prompt. Common mistake: defaulting to Sonnet for all classification without benchmarking both models on a held-out test set. The 10x cost saving applies immediately, with no latency penalty.
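A minimal sketch of the held-out benchmark described above, with the negation instruction built into the prompt. Everything here is illustrative: `build_prompt`, `accuracy`, the label names, and the sample data are hypothetical, and `keyword_stub` is a stand-in for a real model call. To compare models, pass in your own function that sends the prompt to each model and returns its text reply.

```python
from typing import Callable, Iterable, Tuple

# The explicit instruction suggested in this report, with example words added.
NEGATION_HINT = "Check for negation words (e.g. 'not', 'never', 'no longer') before deciding."


def build_prompt(text: str, labels: Tuple[str, ...] = ("spam", "not_spam")) -> str:
    """Build a classification prompt that includes the negation instruction."""
    return (
        f"Classify the text as one of: {', '.join(labels)}.\n"
        f"{NEGATION_HINT}\n"
        "Answer with the label only.\n\n"
        f"Text: {text}"
    )


def accuracy(classify: Callable[[str], str], examples: Iterable[Tuple[str, str]]) -> float:
    """Fraction of held-out (text, gold_label) pairs the classifier gets right."""
    examples = list(examples)
    if not examples:
        raise ValueError("held-out set is empty")
    correct = sum(
        1 for text, gold in examples
        if classify(build_prompt(text)).strip().lower() == gold
    )
    return correct / len(examples)


def keyword_stub(prompt: str) -> str:
    """Stand-in for a model call: a naive keyword rule that ignores negation."""
    text = prompt.rsplit("Text: ", 1)[-1].lower()
    return "spam" if "free money" in text else "not_spam"


# Tiny hypothetical held-out set; the last item is a negation case.
held_out = [
    ("Claim your FREE MONEY now", "spam"),
    ("Meeting moved to 3pm", "not_spam"),
    ("This is not a free money offer, just the invoice", "not_spam"),
]

print(f"accuracy: {accuracy(keyword_stub, held_out):.2f}")
```

Running `accuracy` once per candidate model on the same held-out set gives a like-for-like comparison. Keeping a few negation cases in that set makes the failure mode named above visible in the score.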
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T06:45:39.045384+00:00— report_created — created