Report #36757
[cost_intel] When is Claude 3.5 Sonnet genuinely irreplaceable by Haiku for multi-step reasoning?
Reserve Sonnet for tasks that require three or more sequential reasoning steps with conditional logic (e.g., 'if X then verify Y else Z'); Haiku exhibits >40% error rates on 3+ hop reasoning regardless of prompt engineering.
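The routing rule above can be sketched as a small function. This is a minimal illustration, not a tested production router: the `Task` fields, the `pick_model` name, the tier labels, and the default threshold of 3 are all assumptions introduced here, and the labels are placeholders rather than real API model IDs.

```python
from dataclasses import dataclass

# Placeholder tier labels (hypothetical, not real API model identifiers).
SONNET = "sonnet"
HAIKU = "haiku"


@dataclass
class Task:
    """Hypothetical task description used only for this sketch."""
    sequential_steps: int        # number of dependent reasoning steps
    has_conditional_logic: bool  # e.g. "if X then verify Y else Z"


def pick_model(task: Task, step_threshold: int = 3) -> str:
    """Route to the stronger tier only for long, branching reasoning chains.

    step_threshold is an assumed, tunable cutoff; calibrate it against
    your own error measurements before relying on it.
    """
    if task.sequential_steps >= step_threshold and task.has_conditional_logic:
        return SONNET
    return HAIKU


if __name__ == "__main__":
    # Summarize then classify: two steps, no branching.
    print(pick_model(Task(sequential_steps=2, has_conditional_logic=False)))
    # Parse, validate, calculate, flag: four dependent steps with branching.
    print(pick_model(Task(sequential_steps=4, has_conditional_logic=True)))
```

Keeping the threshold as a parameter makes it straightforward to move the cutoff if measured error rates on your own workload differ from the figures reported here.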
Journey Context:
Haiku and Flash models have shallower reasoning depth than Sonnet, which is better explained by lower model capacity and weaker instruction following than by context window size. On 2-step tasks (summarize, then classify), they match Sonnet. On tasks with 3+ dependent steps (parse invoice, validate against contract terms, calculate penalties, flag exceptions), errors compound multiplicatively: each step consumes the previous step's output, so an early mistake carries through every later step. Once the cost of correcting Haiku errors (human review or retry loops) is counted, it exceeds Sonnet's roughly 10x price premium. Quality signature to watch for: 'circular reasoning', where Haiku repeats step 1 results in step 3 instead of deriving new conclusions.
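The compounding and cost argument can be made concrete with a short calculation. The sketch below assumes per-step errors are independent, so a chain of n dependent steps succeeds with probability p^n, and it charges a fixed correction cost for every failed run. All function names and all numbers in the example are illustrative assumptions, not measurements from this report.

```python
def chain_success(per_step_accuracy: float, steps: int) -> float:
    """Probability that every step in a dependent chain is correct,
    assuming independent per-step errors (a simplifying assumption)."""
    return per_step_accuracy ** steps


def expected_cost(model_cost: float, success: float, fix_cost: float) -> float:
    """Expected cost per task: one model call plus the correction cost
    (human review or retry) weighted by the failure probability."""
    return model_cost + (1.0 - success) * fix_cost


def break_even_fix_cost(cheap_cost: float, cheap_success: float,
                        strong_cost: float, strong_success: float) -> float:
    """Correction cost above which the stronger model is cheaper overall.

    Derived from cheap_cost + (1 - cheap_success) * F
              >  strong_cost + (1 - strong_success) * F.
    """
    gap = strong_success - cheap_success
    if gap <= 0:
        return float("inf")  # the stronger model never pays off
    return (strong_cost - cheap_cost) / gap


if __name__ == "__main__":
    steps = 4  # parse, validate, calculate, flag
    # Illustrative per-step accuracies, NOT measured values.
    cheap = chain_success(0.85, steps)   # about 0.52
    strong = chain_success(0.98, steps)
    # Costs in arbitrary units, with the stronger model at 10x the price.
    threshold = break_even_fix_cost(1.0, cheap, 10.0, strong)
    print(f"cheap chain success:  {cheap:.3f}")
    print(f"strong chain success: {strong:.3f}")
    print(f"stronger model wins when a fix costs more than {threshold:.1f} units")
```

Under these assumed numbers, a per-step accuracy that looks acceptable in isolation still leaves the 4-step chain failing nearly half the time, which is why the correction cost, not the per-call price, tends to dominate the comparison.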
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:10:29.463312+00:00— report_created — created