Report #88576
[cost_intel] Routing all code generation tasks through frontier models regardless of complexity
Route code generation by task type: use Haiku/Flash for CRUD endpoints, API wrappers, unit test scaffolds, standard migrations, and boilerplate (10-20x cheaper, within 5% of frontier quality). Reserve frontier models for novel algorithms, cross-file refactoring, concurrency logic, and architecture decisions.
Journey Context:
Small models have seen millions of examples of standard code patterns during pretraining and reproduce them reliably. When a task exceeds what they can handle, the quality cliff has a specific and dangerous signature: syntactically valid code that compiles, passes lint, and looks correct in code review but contains subtle logic errors, such as wrong loop bounds, off-by-one errors, incorrect but plausible API usage, or swapped variable names in symmetric operations. This is worse than an obvious syntax error because it passes superficial checks. Reliable routing heuristic: if you can find 10+ GitHub repos doing essentially the same thing, a small model can generate it. If the task requires synthesizing a novel approach or understanding cross-cutting constraints, use a frontier model. Implement a two-tier routing layer: send simple generation tasks to small models, and send complex tasks carrying a 'frontier' tag to Sonnet/Pro.
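The two-tier routing layer described above could be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the `CodeGenTask` type, the `route` function, and the task-type labels are hypothetical names chosen for this example, and the fallback of sending unrecognized task types to the frontier tier is an assumption (it errs on the side of quality over cost). No model API is called; the function only decides the tier.

```python
from dataclasses import dataclass, field
from enum import Enum


class Tier(Enum):
    SMALL = "small"        # e.g. Haiku/Flash in the report
    FRONTIER = "frontier"  # e.g. Sonnet/Pro in the report


# Task categories come from the report; the string labels are hypothetical.
SMALL_MODEL_TASKS = frozenset({
    "crud_endpoint",
    "api_wrapper",
    "unit_test_scaffold",
    "standard_migration",
    "boilerplate",
})


@dataclass
class CodeGenTask:
    description: str
    task_type: str
    tags: set = field(default_factory=set)


def route(task: CodeGenTask) -> Tier:
    """Pick a model tier for a code generation task."""
    # An explicit 'frontier' tag always wins, even for a nominally simple type.
    if "frontier" in task.tags:
        return Tier.FRONTIER
    # Well-trodden patterns go to the cheap tier.
    if task.task_type in SMALL_MODEL_TASKS:
        return Tier.SMALL
    # Assumption: anything unrecognized (novel algorithms, concurrency,
    # cross-file refactoring, architecture) falls back to the frontier tier.
    return Tier.FRONTIER
```

For example, `route(CodeGenTask("Add GET /users endpoint", "crud_endpoint"))` selects the small tier, while the same task with `tags={"frontier"}` is escalated. Defaulting unknown types upward means a misclassified task costs more rather than silently producing plausible but wrong code.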
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T07:15:20.084989+00:00 - report_created - created