Report #47943
[cost_intel] Claude 3 Haiku vs Sonnet: identifying the semantic cliff in code generation tasks
Restrict Haiku to syntax-level transformations (linting, formatting, regex); for semantic changes (refactoring, cross-file dependencies, algorithm selection), Sonnet is 3x more accurate and net cheaper when accounting for debugging labor.
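A minimal sketch of the routing rule above, in Python. The task categories are taken from the recommendation; the `TaskKind` enum, the `choose_model` function, and the `"haiku"`/`"sonnet"` tier labels are hypothetical names for illustration and are not real API model identifiers.

```python
from enum import Enum


class TaskKind(Enum):
    # Syntax-level transformations named in the report
    LINTING = "linting"
    FORMATTING = "formatting"
    REGEX = "regex"
    # Semantic changes named in the report
    REFACTORING = "refactoring"
    CROSS_FILE_DEPENDENCIES = "cross_file_dependencies"
    ALGORITHM_SELECTION = "algorithm_selection"


# Only these kinds are considered safe for the cheaper tier.
SYNTAX_LEVEL = frozenset(
    {TaskKind.LINTING, TaskKind.FORMATTING, TaskKind.REGEX}
)


def choose_model(kind: TaskKind) -> str:
    """Return a tier label (hypothetical, not an API model ID)."""
    # Anything not explicitly syntax-level falls through to Sonnet,
    # so a newly added task kind defaults to the more accurate tier.
    return "haiku" if kind in SYNTAX_LEVEL else "sonnet"
```

Using an allowlist for the cheap tier, rather than a denylist, means a misclassified or new task kind errs toward higher token spend instead of toward silent semantic failures.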
Journey Context:
Haiku costs $0.25/1M tokens vs Sonnet's $3/1M, which tempts teams to default to it. However, SWE-bench evaluations show Haiku solving ~5% of issues vs Sonnet's ~15%. Haiku tends to generate syntactically valid but semantically incorrect code: silent failures that require expensive human debugging. At 100k tasks (about 1,000 tokens per task, as implied by these totals), Haiku costs $25 in tokens but requires $500 in human review, for $525 overall; Sonnet costs $300 with negligible review. The 12x token price is more than offset by roughly 3x higher accuracy on complex logic.
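The arithmetic in the paragraph above can be reproduced with a short Python sketch. Prices and review costs come from the report; the figure of 1,000 tokens per task is an assumption inferred from the stated token totals, "negligible" review is modeled as $0, and the names `ModelCost`, `total_cost`, and `break_even_review` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCost:
    name: str
    price_per_mtok: float  # USD per 1M tokens (from the report)
    review_cost: float     # USD of human review for the whole batch (from the report)


def token_cost(model: ModelCost, tasks: int, tokens_per_task: int) -> float:
    """Token spend in USD for a batch of tasks."""
    return tasks * tokens_per_task / 1_000_000 * model.price_per_mtok


def total_cost(model: ModelCost, tasks: int, tokens_per_task: int) -> float:
    """Token spend plus human review, in USD."""
    return token_cost(model, tasks, tokens_per_task) + model.review_cost


def break_even_review(cheap: ModelCost, strong: ModelCost,
                      tasks: int, tokens_per_task: int) -> float:
    """Review spend on the cheap model at which both totals are equal."""
    return (total_cost(strong, tasks, tokens_per_task)
            - token_cost(cheap, tasks, tokens_per_task))


TASKS = 100_000
TOKENS_PER_TASK = 1_000  # assumption: implied by $25 for 100k tasks at $0.25/1M

haiku = ModelCost("haiku", price_per_mtok=0.25, review_cost=500.0)
sonnet = ModelCost("sonnet", price_per_mtok=3.00, review_cost=0.0)  # "negligible" as 0

haiku_total = total_cost(haiku, TASKS, TOKENS_PER_TASK)     # 25 + 500 = 525
sonnet_total = total_cost(sonnet, TASKS, TOKENS_PER_TASK)   # 300 + 0 = 300
threshold = break_even_review(haiku, sonnet, TASKS, TOKENS_PER_TASK)  # 300 - 25 = 275
```

Under these assumptions, Haiku is the cheaper choice only while its human review cost for the batch stays below $275; the report's $500 estimate is well above that threshold.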
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T10:56:59.589946+00:00— report_created — created