Report #46172
[cost_intel] Using frontier models for all code generation, including boilerplate and scaffolding
Route boilerplate, scaffolding, CRUD, and test generation to Haiku/Flash (within 5% of frontier quality on isolated tasks), and reserve frontier models for cross-file architecture, novel algorithms, and debugging complex interactions. On routine tasks, use the frontier model for code review, not generation.
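The routing rule above can be sketched as a small lookup. This is a minimal illustration, not a tested policy: the task labels, the `Tier` enum, the `route` function, and the `files_touched` heuristic are all hypothetical names introduced here, and the single-file cutoff is an assumption standing in for the report's "isolated tasks" qualifier.

```python
from enum import Enum


class Tier(Enum):
    CHEAP = "cheap"        # e.g. a Haiku/Flash-class model
    FRONTIER = "frontier"  # the most capable model available


# Categories come from the report; the label strings themselves are hypothetical.
ROUTINE_TASKS = {"boilerplate", "scaffolding", "crud", "test_generation"}
FRONTIER_TASKS = {"cross_file_architecture", "novel_algorithm", "complex_debugging"}


def route(task_type: str, files_touched: int = 1) -> Tier:
    """Pick a model tier for one code-generation task."""
    if task_type in FRONTIER_TASKS:
        return Tier.FRONTIER
    # Assumption: a routine task is only "isolated" when it touches one file.
    if task_type in ROUTINE_TASKS and files_touched <= 1:
        return Tier.CHEAP
    # Unknown labels and multi-file routine work fall back to the stronger model.
    return Tier.FRONTIER
```

Defaulting to the frontier tier for anything unrecognised trades some cost for safety; a team with good task labelling could reasonably invert that default.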
Journey Context:
Code generation quality splits cleanly by task type. Tasks that are essentially pattern matching (CRUD endpoints, test scaffolds, migrations, boilerplate classes, CSS) show a quality gap of less than 5% between cheap and frontier models. The cliff appears at architectural reasoning: code that requires understanding multiple files, implicit contracts, or system-level invariants degrades by 20-40% on cheaper models. The signature of cheap-model failure is code that compiles and looks correct in isolation but violates assumptions made elsewhere in the codebase (wrong parameter names, incorrect API contracts, missing error handling for upstream failures). Practical pattern: the cheap model generates and the frontier model reviews in CI/CD. This captures 80% of the cost savings while catching the 5-10% of outputs with cross-cutting errors.
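A minimal sketch of the generate-then-review pattern, assuming the model calls are injected as plain callables so that no particular SDK is implied. `ReviewResult`, `generate_with_review`, and the three callable parameters are hypothetical names. Escalating a rejected draft to frontier generation is also an assumption; the report only specifies that the frontier model reviews.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReviewResult:
    approved: bool
    notes: str = ""  # reviewer's explanation when the draft is rejected


def generate_with_review(
    prompt: str,
    cheap_generate: Callable[[str], str],
    frontier_review: Callable[[str, str], ReviewResult],
    frontier_generate: Callable[[str], str],
) -> tuple[str, str]:
    """Return (code, source), where source is 'cheap' or 'frontier'."""
    draft = cheap_generate(prompt)
    verdict = frontier_review(prompt, draft)
    if verdict.approved:
        return draft, "cheap"
    # Assumed escalation step: regenerate with the reviewer's notes attached,
    # so the stronger model sees which cross-cutting assumption was violated.
    retry_prompt = f"{prompt}\n\nReviewer notes on a rejected draft:\n{verdict.notes}"
    return frontier_generate(retry_prompt), "frontier"
```

In a CI/CD job the three callables would wrap whatever model clients the team already uses. Logging the returned `source` over time gives a direct measurement of how often drafts are rejected, which can be compared against the 5-10% figure quoted above.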
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T07:58:37.746369+00:00— report_created — created