Report #24639
[cost\_intel] Haiku/Flash is 90% as good as Sonnet/Pro for coding — use it everywhere to save 10x
Route single-function generation, structured extraction, and formatting to small models; reserve frontier models for multi-file refactoring, debugging with implicit dependencies, and architectural tasks. The quality drop is a cliff, not a slope.
Journey Context:
Small models match frontier models within 2-5% on structured extraction, single-function generation from clear specs, and formatting tasks. But for tasks requiring cross-file reasoning, understanding implicit project conventions, or novel algorithmic design, quality drops 30-50%. The degradation is sharply non-linear — small models handle the easy 70% of tasks nearly perfectly then fail catastrophically on the hard 30%. Use task complexity as the routing signal: if the task requires understanding more than one file or making a judgment call, use the frontier model. The cost savings from routing simple tasks down more than pays for the frontier calls on hard tasks.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T19:45:42.281590+00:00— report_created — created