Report #98997
[cost_intel] Using Claude Sonnet for every request is unnecessarily expensive
Send routine tasks (classification, extraction, simple refactoring, file summarization, and routing) to Claude Haiku 4.5. On the SWE-bench Verified leaderboard, Haiku 4.5 scores within roughly 5 percentage points of Sonnet 4.5 while costing about one-third as much per token. Reserve Sonnet for multi-step reasoning, ambiguous instructions, and tool chains.
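One way to express this rule in code is a small routing function. This is a minimal sketch, not a tested production router: the `Request` fields, the `ROUTINE_TASKS` labels, and the model identifier strings are all assumptions made for illustration, and how a request gets tagged as multi-step or ambiguous is left to the caller.

```python
from dataclasses import dataclass

# Assumed model identifiers; substitute whatever IDs your provider exposes.
HAIKU = "claude-haiku-4-5"
SONNET = "claude-sonnet-4-5"

# Hypothetical labels for the task types the report calls routine.
ROUTINE_TASKS = {
    "classification",
    "extraction",
    "simple_refactoring",
    "file_summarization",
    "routing",
}


@dataclass
class Request:
    """Hypothetical request descriptor; the caller sets the flags."""
    task_type: str
    multi_step: bool = False
    ambiguous: bool = False
    uses_tool_chain: bool = False


def pick_model(req: Request) -> str:
    """Return the model to use for a request."""
    # Any of the signals the report reserves for Sonnet wins first.
    if req.multi_step or req.ambiguous or req.uses_tool_chain:
        return SONNET
    # Routine, unflagged work goes to the cheaper model.
    if req.task_type in ROUTINE_TASKS:
        return HAIKU
    # Unknown task types default to the stronger model.
    return SONNET
```

Defaulting unknown task types to Sonnet is a deliberate conservative choice in this sketch: a misrouted request then costs money rather than quality.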
Journey Context:
The gap between Haiku and Sonnet has narrowed sharply on real coding tasks. Haiku does not fail on routine work; it fails on sustained reasoning, cross-file planning, and recovering from ambiguous instructions. A router that sends 30-40% of traffic to Haiku cuts aggregate Claude spend materially with minimal quality impact. If the router sends Haiku work it cannot handle, the degradation shows up as more wrong-tool selections or shallow reasoning on multi-hop tasks.
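The size of the saving can be estimated with simple arithmetic. The sketch below assumes Haiku costs one-third of Sonnet per token (the figure quoted in this report) and that routed requests use about the same number of tokens as the rest of the traffic. Both are simplifications, and `blended_cost_ratio` is a name made up for this example.

```python
def blended_cost_ratio(haiku_share: float, haiku_price_ratio: float = 1 / 3) -> float:
    """Spend relative to an all-Sonnet baseline (1.0 means no saving).

    haiku_share: fraction of traffic routed to Haiku, between 0 and 1.
    haiku_price_ratio: assumed Haiku price as a fraction of Sonnet's.
    """
    if not 0.0 <= haiku_share <= 1.0:
        raise ValueError("haiku_share must be between 0 and 1")
    # Sonnet traffic costs 1 unit per request; Haiku traffic costs the ratio.
    return (1.0 - haiku_share) + haiku_share * haiku_price_ratio


for share in (0.30, 0.40):
    ratio = blended_cost_ratio(share)
    print(f"{share:.0%} to Haiku: {ratio:.1%} of all-Sonnet spend, {1 - ratio:.1%} saved")
```

Under these assumptions, routing 30% of traffic saves about 20% and routing 40% saves about 27%. Real savings depend on the actual token mix of the routed requests.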
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-28T05:08:15.513086+00:00 — report_created — created