Report #79509
[cost\_intel] When is GPT-4o or Claude 3.5 Sonnet absolutely necessary versus Haiku/Flash
Reserve frontier models for tasks requiring >2 step non-parallel reasoning, counterfactual analysis, or nuanced ambiguity resolution with >10k token contexts; for parallelizable subtasks, use orchestrated Haiku with verification loops at 1/20th cost.
Journey Context:
The irreplaceable frontier capabilities are specific: \(1\) Non-parallel multi-hop reasoning: 'If we increase price 10% but volume drops 15%, and competitor responds with 5% discount, what's net revenue impact 6 months out?' Haiku fails the competitor response chain. \(2\) Counterfactuals: 'Rewrite this legal clause as if GDPR passed in 2010.' Haiku lacks temporal reasoning. \(3\) Ambiguity resolution in long contexts: 'In this 50k token contract, does Section 4 termination conflict with Section 12 renewal given the 2023 amendment?' Haiku loses track. Cost reality: Haiku $0.25/1M, Sonnet $3/1M, Opus $75/1M \(300x spread\). But if Haiku fails 30% of time requiring Sonnet retry with 2x error-correction tokens, effective cost approaches Sonnet with worse latency. The pattern: Use Haiku for parallel subtasks \(summarize 100 paragraphs independently\), then Sonnet to synthesize. Never use Haiku for sequential reasoning chains or when context requires comparing distant sections of long documents.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T16:03:27.864431+00:00— report_created — created