Report #91462
[cost\_intel] Haiku 3.5 matches Sonnet 3.5 accuracy on multi-label ticket classification
Use Claude 3.5 Haiku with 3-shot examples for support ticket classification; it holds within 2% of Sonnet 3.5 accuracy at 15x lower cost \($0.25 vs $3.75/1M tok\). Fail-over to Sonnet only when classification requires implicit world knowledge \(e.g., disambiguating 'apple' as company vs fruit without context\).
Journey Context:
Teams default to Sonnet for all classification due to fear of accuracy loss, but for explicit pattern-matching tasks with few-shot examples, Haiku's architecture is sufficient. The cliff occurs on implicit reasoning: Haiku drops 15% accuracy on ambiguous queries requiring background knowledge. The 15x cost delta means you can afford a two-stage pipeline: Haiku for first pass, Sonnet for uncertainty quantification \(low confidence scores\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T12:06:38.667236+00:00— report_created — created