Report #39659
[cost_intel] Using frontier models for straightforward classification tasks
Use Haiku 3.5 or Gemini Flash for binary or multi-class classification with clear, unambiguous labels and short inputs (<2K tokens). Quality is typically within 2-5% of Sonnet/Pro at 12-17x lower cost per token.
Journey Context:
Classification is fundamentally pattern-matching. When categories are well-defined and inputs are short, small models have enough capacity to match frontier models. The cliff comes with ambiguous categories, inputs that require cross-referencing, or adversarial edge cases. Test with 500 labeled examples from your actual distribution: if Haiku/Flash accuracy is within 5% of Sonnet/Pro, switch immediately. At 1M requests/month with a 1.5K-token average prompt (1.5B input tokens), Haiku ($0.25/M input) costs ~$375 vs. Sonnet ($3/M input) at ~$4,500, a $4,125/month delta on input tokens alone for near-identical accuracy. The failure signature to watch: small models start over-classifying into the majority category when classes are imbalanced or semantically close.
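A minimal Python sketch of the checks described above: the input-token cost arithmetic, the accuracy gate, and a probe for majority-class over-prediction. The function names are hypothetical, and the sketch assumes "within 5%" means an absolute gap in accuracy (0.05), which the report does not spell out. Predictions are plain lists of labels collected however you call each model. Output tokens are ignored, as in the figures above.

```python
from collections import Counter


def monthly_input_cost(requests_per_month, avg_prompt_tokens, usd_per_million_input):
    """Input-token cost only; output tokens are not counted."""
    total_tokens = requests_per_month * avg_prompt_tokens
    return total_tokens / 1_000_000 * usd_per_million_input


def accuracy(predictions, labels):
    """Fraction of predictions that exactly match the gold labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def should_switch(small_preds, large_preds, labels, max_gap=0.05):
    """True when the small model trails the large one by at most max_gap.

    Assumption: max_gap is an absolute accuracy difference, not a relative one.
    """
    gap = accuracy(large_preds, labels) - accuracy(small_preds, labels)
    return gap <= max_gap


def majority_overprediction(predictions, labels):
    """Predicted count / true count for the most common gold label.

    Values well above 1.0 match the failure signature: the model is
    collapsing minority classes into the majority class.
    """
    majority_label, true_count = Counter(labels).most_common(1)[0]
    predicted_count = sum(p == majority_label for p in predictions)
    return predicted_count / true_count


# Figures from the report: 1M requests/month, 1.5K-token average prompt.
small = monthly_input_cost(1_000_000, 1_500, 0.25)
large = monthly_input_cost(1_000_000, 1_500, 3.00)
print(f"small=${small:,.0f} large=${large:,.0f} delta=${large - small:,.0f}")
# prints: small=$375 large=$4,500 delta=$4,125
```

Run `should_switch` and `majority_overprediction` on the same 500-example labeled set. A small model can pass the accuracy gate on an imbalanced set while still showing a ratio well above 1.0, so treat the two checks as complementary.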
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:02:33.584919+00:00— report_created — created