Report #71218
[cost_intel] Using Claude 3 Opus for code generation when Sonnet 3.5 is cheaper and more capable
For code generation, refactoring, and bug fixing, use Claude 3.5 Sonnet instead of Claude 3 Opus. Claude 3.5 Sonnet posts higher reported SWE-bench scores (56% vs 33%) at one-fifth the price ($3 vs $15 per 1M input tokens, $15 vs $75 per 1M output tokens) and roughly half the latency. Reserve Opus for very large contexts (>100k tokens) where Sonnet's context handling degrades, or for specific long-horizon reasoning tasks such as novel algorithm design with reasoning chains longer than 10 steps.
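The selection rule above can be sketched as a small routing helper. This is a minimal illustration, not a tested policy: the `Task` fields, the `pick_model` function, and the model labels are hypothetical names introduced here, and the thresholds are simply the ones stated in this report.

```python
from dataclasses import dataclass

# Hypothetical labels for illustration; these are not real API model identifiers.
SONNET = "claude-3.5-sonnet"
OPUS = "claude-3-opus"

# Thresholds taken from the guidance in this report.
LONG_CONTEXT_TOKENS = 100_000
LONG_REASONING_STEPS = 10


@dataclass
class Task:
    kind: str                 # e.g. "code_generation", "refactoring", "bug_fixing", "algorithm_design"
    context_tokens: int       # estimated prompt size in tokens
    reasoning_steps: int = 1  # rough estimate of the reasoning chain length


def pick_model(task: Task) -> str:
    """Default to Sonnet; fall back to Opus only for the two exceptions in the report."""
    if task.context_tokens > LONG_CONTEXT_TOKENS:
        return OPUS
    if task.kind == "algorithm_design" and task.reasoning_steps > LONG_REASONING_STEPS:
        return OPUS
    return SONNET
```

For example, `pick_model(Task("bug_fixing", 8_000))` routes to Sonnet, while a 150k-token context or a 12-step algorithm design task routes to Opus.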
Journey Context:
People assume 'Opus' means the best code, but Claude 3.5 Sonnet was trained after the Opus release with an emphasis on advanced tool use and coding RLHF. Opus is the older model. The price gap is large ($15 vs $3 per 1M input tokens). Opus also does worse on complex multi-file edits than Claude 3.5 Sonnet, which has stronger agentic coding capabilities. The error is assuming that model tier correlates with capability on a specific task instead of checking benchmarks.
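To make the cost gap concrete, here is a small worked calculation. It is a sketch under assumptions: the price table uses list prices in USD per 1M tokens ($3 input / $15 output for Claude 3.5 Sonnet, $15 input / $75 output for Claude 3 Opus), which should be checked against current pricing, and the model labels and `request_cost` helper are hypothetical names.

```python
# Assumed list prices in USD per 1M tokens; verify against current pricing before relying on them.
PRICES_PER_MTOK = {
    "claude-3.5-sonnet": {"input": 3.00, "output": 15.00},
    "claude-3-opus": {"input": 15.00, "output": 75.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request for the given token counts."""
    price = PRICES_PER_MTOK[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Example request: 20,000 input tokens and 2,000 output tokens.
sonnet_cost = request_cost("claude-3.5-sonnet", 20_000, 2_000)
opus_cost = request_cost("claude-3-opus", 20_000, 2_000)
print(f"Sonnet: ${sonnet_cost:.2f}  Opus: ${opus_cost:.2f}")
```

Under these assumed prices the example request costs about $0.09 on Sonnet and $0.45 on Opus. Because both input and output prices differ by the same factor, the 5x ratio holds for any mix of input and output tokens.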
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T02:07:18.300403+00:00— report_created — created