Report #78812
[cost_intel] Claude 3 Haiku costs 80% less than Sonnet but fails 40% of structured extraction tasks, creating a higher net cost per success due to retry cascades
Use Haiku only for entity recognition on pre-segmented chunks. Route full-schema extraction tasks with more than 10 fields or nested objects directly to Sonnet or GPT-4o-mini, never to Haiku.
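The routing rule above can be sketched as a small function. This is only an illustration under stated assumptions: the schema is assumed to be a JSON-Schema-style dict with a top-level `properties` key, the model labels are placeholder strings rather than real API identifiers, and `choose_model`, `is_complex`, and `MAX_FIELDS` are hypothetical names. The report does not say where simple non-entity tasks should go, so sending them to GPT-4o-mini is an assumption.

```python
from typing import Any

MAX_FIELDS = 10  # threshold taken from the recommendation above


def is_complex(schema: dict[str, Any]) -> bool:
    """True if the schema has more than MAX_FIELDS properties or any nested object."""
    props = schema.get("properties", {})
    if len(props) > MAX_FIELDS:
        return True
    for prop in props.values():
        if prop.get("type") == "object":
            return True
        # An array whose items are objects also counts as nesting.
        if prop.get("type") == "array" and prop.get("items", {}).get("type") == "object":
            return True
    return False


def choose_model(task: str, schema: dict[str, Any], pre_segmented: bool) -> str:
    """Pick a model label (placeholder strings, not real API model IDs)."""
    if is_complex(schema):
        return "sonnet"  # never Haiku for large or nested schemas
    if task == "entity_recognition" and pre_segmented:
        return "haiku"  # the only case the report recommends Haiku for
    return "gpt-4o-mini"  # assumed default for everything else


flat = {"type": "object", "properties": {"name": {"type": "string"}}}
nested = {"type": "object", "properties": {"address": {"type": "object"}}}

print(choose_model("entity_recognition", flat, pre_segmented=True))    # haiku
print(choose_model("entity_recognition", nested, pre_segmented=True))  # sonnet
```

Checking complexity before the task type means a nested schema is never sent to Haiku, even when the task is labelled as entity recognition.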
Journey Context:
Haiku is $0.25/1M input tokens vs Sonnet at $3/1M (12x cheaper). For simple classification, Haiku works. But for structured JSON extraction with nested schemas, Haiku hallucinates field types, omits required keys, or generates malformed JSON at a 30-40% rate. Each failure requires a retry with Sonnet anyway, on top of the wasted Haiku call. Reported net result: once retries cascade, the cost per successful extraction is higher with Haiku+retry than with a single Sonnet call. The cliff appears when schema complexity exceeds ~5 fields or requires nested objects. GPT-4o-mini, at $0.15/1M input tokens with a ~10% failure rate, is often the optimal point for medium complexity.
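To check the cost comparison against your own traffic, the expected spend per successful extraction can be estimated as below. This is a minimal sketch, not the reporter's method. It assumes every call sends the same number of input tokens, failures are independent, the fallback model always succeeds, and output-token cost is ignored. `cost_per_success` and `break_even_fail_rate` are hypothetical helper names.

```python
def cost_per_success(first_cost: float, fail_rate: float,
                     fallback_cost: float, first_attempts: int = 1) -> float:
    """Expected cost per success: try the cheap model up to `first_attempts`
    times, then fall back once to the stronger model (assumed to succeed)."""
    expected = 0.0
    p_reach = 1.0  # probability that this attempt is made at all
    for _ in range(first_attempts):
        expected += p_reach * first_cost
        p_reach *= fail_rate
    return expected + p_reach * fallback_cost


def break_even_fail_rate(first_cost: float, fallback_cost: float) -> float:
    """Failure rate above which one cheap attempt plus fallback costs more
    than calling the fallback model directly."""
    return 1.0 - first_cost / fallback_cost


# Prices in $ per 1M input tokens, taken from the paragraph above.
HAIKU, SONNET = 0.25, 3.00

print(round(cost_per_success(HAIKU, 0.40, SONNET), 2))  # 1.45
print(round(break_even_fail_rate(HAIKU, SONNET), 3))    # 0.917
```

With only the input prices and the 40% failure rate quoted above, one Haiku attempt plus a Sonnet fallback comes to about $1.45 per 1M input tokens against $3.00 for Sonnet alone. Input pricing by itself therefore does not produce the reported outcome. Modelling a retry cascade means adding the terms this sketch leaves out: output tokens, context re-sent on each retry, and validation overhead.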
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T14:52:59.387146+00:00— report_created — created