Report #91749
[cost_intel] Using frontier models for text classification, sentiment analysis, and routing tasks
Use Haiku/Flash-class models for classification: they come within 2-5% of Sonnet/Pro accuracy at 10-20x lower cost per token. Implement a cascade in which low-confidence outputs escalate to a frontier model.
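A minimal sketch of such a cascade, under stated assumptions: each model is wrapped as a callable that returns a label plus a confidence score in [0, 1]. How that score is obtained (token log-probabilities, a self-reported score, agreement across samples) is not specified in this report. The names `classify_with_cascade`, `stub_small`, `stub_frontier`, and the 0.8 threshold are hypothetical and only illustrate the control flow.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Assumption: a classifier maps text to (label, confidence in [0, 1]).
Classifier = Callable[[str], Tuple[str, float]]


@dataclass
class CascadeResult:
    label: str
    confidence: float
    escalated: bool  # True if the frontier model produced the final label


def classify_with_cascade(
    text: str,
    small: Classifier,
    frontier: Classifier,
    threshold: float = 0.8,  # hypothetical value; tune on held-out production data
) -> CascadeResult:
    """Try the small model first; escalate only when its confidence is low."""
    label, confidence = small(text)
    if confidence >= threshold:
        return CascadeResult(label, confidence, escalated=False)
    # Low confidence: pay for the frontier model on this input only.
    label, confidence = frontier(text)
    return CascadeResult(label, confidence, escalated=True)


# Stand-in classifiers so the sketch runs without any API access.
def stub_small(text: str) -> Tuple[str, float]:
    lowered = text.lower()
    if "love" in lowered:
        return ("positive", 0.95)
    if "hate" in lowered:
        return ("negative", 0.95)
    return ("neutral", 0.40)  # ambiguous input, low confidence


def stub_frontier(text: str) -> Tuple[str, float]:
    return ("mixed", 0.90)
```

In a real deployment the two stubs would be replaced by calls to the small and frontier models, and the threshold would be chosen by measuring escalation rate against accuracy on a sample of production traffic.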
Journey Context:
Classification is pattern matching against learned representations, which smaller models handle well because the decision boundary is simple. Quality does not degrade gradually: small models handle clear-cut cases at frontier parity but miss ambiguous boundary cases. Evaluating only on hard edge cases therefore creates a false impression of a wide quality gap. In production distributions, 85-95% of inputs are unambiguous, so small models match frontier models on the bulk of traffic. A cascade architecture (small model first, escalate when confidence falls below a threshold) captures 80%+ of the cost savings with <1% effective quality loss versus an all-frontier setup.
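The savings figure can be sanity-checked with simple arithmetic. The sketch below assumes every input is sent to the small model, escalated inputs are additionally sent to the frontier model, and both calls use roughly the same number of tokens. None of this is stated in the report, and the function names are hypothetical. Under those assumptions the cascade costs `cost_ratio + escalation_rate` relative to an all-frontier setup.

```python
def cascade_relative_cost(cost_ratio: float, escalation_rate: float) -> float:
    """Cascade cost as a fraction of all-frontier cost.

    cost_ratio: small-model cost / frontier cost per input (e.g. 0.05 for 20x cheaper).
    escalation_rate: fraction of inputs that fall below the confidence threshold.
    Assumes equal token counts for both models on a given input.
    """
    if not (0.0 <= cost_ratio <= 1.0 and 0.0 <= escalation_rate <= 1.0):
        raise ValueError("cost_ratio and escalation_rate must be in [0, 1]")
    # Every input pays the small model; escalated inputs also pay the frontier model.
    return cost_ratio + escalation_rate


def cascade_savings(cost_ratio: float, escalation_rate: float) -> float:
    """Fraction of all-frontier spend saved by the cascade."""
    return 1.0 - cascade_relative_cost(cost_ratio, escalation_rate)
```

With a 20x cheaper small model and 5% of traffic escalated, this gives about 90% savings. With a 10x cheaper model and 15% escalated, it gives about 75%. Under these assumptions the 80%+ figure holds whenever the cost ratio and escalation rate sum to 0.2 or less.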
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T12:35:35.249001+00:00— report_created — created