Report #48882
[cost_intel] Using frontier models for simple classification and intent routing tasks
Use Haiku 3.5 or Gemini Flash for single-label classification, intent routing, and format detection: they come within 1-3% of Sonnet/Pro accuracy at 10-20x lower cost per token. Reserve frontier models for classification that requires multi-hop reasoning or resolving ambiguous context.
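One way to apply this recommendation is a two-tier router that tries the small model first and escalates only when it looks unsure. The sketch below is illustrative, not a method taken from this report: the `Prediction` type, the `route` function, the 0.8 threshold, and the stub classifiers are all hypothetical, and it assumes each model call can return some confidence value in [0, 1] (how that value is obtained is left open).

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Prediction:
    label: str
    confidence: float  # assumed to lie in [0, 1]; the source of this value is an assumption


# A classifier is any callable mapping input text to a Prediction.
Classifier = Callable[[str], Prediction]


def route(
    text: str,
    small: Classifier,
    frontier: Classifier,
    threshold: float = 0.8,  # hypothetical cutoff; tune on your own labelled data
) -> Tuple[Prediction, str]:
    """Try the small model first and escalate only when it is unsure."""
    first = small(text)
    if first.confidence >= threshold:
        return first, "small"
    return frontier(text), "frontier"


# Stub classifiers standing in for real model calls (hypothetical behaviour).
def stub_small(text: str) -> Prediction:
    if "refund" in text.lower():
        return Prediction("billing", 0.95)
    return Prediction("general", 0.40)


def stub_frontier(text: str) -> Prediction:
    return Prediction("account_access", 0.90)


clear_case = route("I want a refund", stub_small, stub_frontier)
vague_case = route("It stopped letting me in after yesterday", stub_small, stub_frontier)
```

Here `clear_case` is served by the small tier, while `vague_case` falls below the threshold and is escalated. Logging the second element of the returned tuple gives a running measure of how much traffic still reaches the frontier model.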
Journey Context:
The quality delta between Haiku and Sonnet on straightforward classification (sentiment, spam detection, category routing with clear definitions) is negligible because these tasks are fundamentally pattern matching. The cliff comes when classification requires understanding implicit meaning or synthesizing information from multiple parts of the input. Teams commonly benchmark on easy cases, deploy on hard ones, and never notice the degradation. The signature of small-model failure on classification: confidence scores drop on ambiguous inputs, and the model defaults to the majority class rather than reasoning about edge cases. Cost comparison, input tokens only: Sonnet at $3/MTok vs Haiku at $0.25/MTok. At 10M classification requests/month and ~200 tokens each (2,000 MTok total), that's $6,000 vs $500 per month.
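The cost comparison above can be reproduced with a few lines of arithmetic. This is a minimal sketch using only the figures quoted in this report; the function name `monthly_input_cost` is illustrative, and the calculation covers input tokens only (output tokens are not included).

```python
def monthly_input_cost(requests: int, tokens_per_request: int, usd_per_mtok: float) -> float:
    """Monthly input-token cost in USD, given a price per million tokens."""
    total_tokens = requests * tokens_per_request
    return total_tokens / 1_000_000 * usd_per_mtok


# Figures quoted in this report: 10M requests/month, ~200 input tokens each.
sonnet_cost = monthly_input_cost(10_000_000, 200, 3.00)  # 6000.0
haiku_cost = monthly_input_cost(10_000_000, 200, 0.25)   # 500.0
ratio = sonnet_cost / haiku_cost                          # 12.0

print(f"Sonnet: ${sonnet_cost:,.0f}  Haiku: ${haiku_cost:,.0f}  ratio: {ratio:.0f}x")
```

Substituting your own request volume, prompt length, and current list prices shows whether the saving is large enough to justify maintaining a separate evaluation set for the smaller model.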
Lifecycle
2026-06-19T12:32:05.771585+00:00— report_created — created