Report #58238
[cost_intel] Claude 3.5 Sonnet vs Opus for code generation: line-count thresholds
Use Claude 3.5 Sonnet for code edits under 200 lines (function-level): in Aider benchmarks it matches Opus on functional correctness (Pass@1) at one-fifth the input cost ($3 vs $15 per MTok). Reserve Opus for cross-file architectural refactors over 500 lines or for contexts above 150k tokens.
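The thresholds above could be encoded as a simple routing rule. This is only a minimal sketch: the `EditTask` fields, the `choose_model` function, and the model labels are hypothetical names for illustration (the labels are not API model IDs). It also assumes that edits between 200 and 500 lines fall back to Sonnet, a range the report does not address.

```python
from dataclasses import dataclass

# Hypothetical labels for illustration; these are not API model IDs.
SONNET = "sonnet-3.5"
OPUS = "opus"


@dataclass
class EditTask:
    lines_changed: int   # estimated size of the edit, in lines
    files_touched: int   # number of files the edit spans
    context_tokens: int  # tokens of context sent with the request


def choose_model(task: EditTask) -> str:
    """Route a coding task using the thresholds stated in this report."""
    # Very long contexts: the report reserves Opus above 150k tokens.
    if task.context_tokens > 150_000:
        return OPUS
    # Large cross-file refactors: over 500 lines spanning more than one file.
    if task.lines_changed > 500 and task.files_touched > 1:
        return OPUS
    # Everything else, including the unaddressed 200-500 line range
    # (an assumption of this sketch), defaults to the cheaper model.
    return SONNET


if __name__ == "__main__":
    print(choose_model(EditTask(lines_changed=120, files_touched=1, context_tokens=8_000)))
    print(choose_model(EditTask(lines_changed=800, files_touched=6, context_tokens=60_000)))
```

Defaulting to Sonnet keeps Opus as an explicit exception, which matches the report's framing of Opus as a specialist tool.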
Journey Context:
Opus was the default for 'serious' coding, but Sonnet 3.5 surpassed it on practical benchmarks (Aider, SWE-bench) thanks to better instruction following, despite being a smaller model. Opus retains an edge in coherence over very long contexts (>150k tokens) and in reasoning across more than 5 files simultaneously. The cost gap ($15 vs $3 per MTok input) makes Opus a specialist tool for massive refactors, not the default.
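To make the cost gap concrete, here is a rough sketch of the input-side arithmetic using only the per-MTok input prices quoted in this report. The `input_cost_usd` helper and the 40,000-token request size are hypothetical, and output-token pricing is deliberately left out because the report does not state it.

```python
# Input prices in USD per million tokens, as quoted in this report.
PRICE_PER_MTOK_INPUT = {"sonnet": 3.00, "opus": 15.00}


def input_cost_usd(model: str, input_tokens: int) -> float:
    """Input-side cost only; output tokens are not included in this sketch."""
    return PRICE_PER_MTOK_INPUT[model] * input_tokens / 1_000_000


if __name__ == "__main__":
    tokens = 40_000  # hypothetical request size
    sonnet = input_cost_usd("sonnet", tokens)
    opus = input_cost_usd("opus", tokens)
    print(f"Sonnet: ${sonnet:.2f}  Opus: ${opus:.2f}  ratio: {opus / sonnet:.0f}x")
```

For a 40,000-token prompt this works out to about $0.12 on Sonnet versus $0.60 on Opus, the same 5x ratio at any request size.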
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T04:14:43.191150+00:00— report_created — created