Report #42706
[cost\_intel] Using Haiku or Flash for complex code generation involving cross-file dependencies or subtle API contracts
Reserve frontier models \(Sonnet 3.5, GPT-4o, Opus\) for code generation that requires cross-file dependency awareness, compliance with undocumented API contracts, non-trivial algorithmic logic, or security-sensitive patterns. Use small models only for boilerplate, CRUD handlers, and single-function tasks with explicit complete specs.
Journey Context:
Small models generate syntactically valid code at near-frontier rates, but the semantic error rate is 3-5x higher for complex tasks. The degradation signature: code that passes linting and type-checking but fails on edge cases, uses deprecated API patterns, violates implicit contracts, or has subtle logic errors in non-obvious paths. Each semantic error costs 15-60 minutes of engineer debugging time, which at $50-100 per hour far exceeds the $0.01-0.05 saved on inference per request. On SWE-bench, frontier models significantly outperform smaller models on real GitHub issues because real bugs require understanding cross-file context and implicit contracts that small models miss.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:08:57.059149+00:00— report_created — created