Report #92526
[cost\_intel] Haiku 3.5 matches Sonnet 3.5 on flat JSON extraction but fails on nested reasoning schemas
Use Haiku for flat key-value extraction \(literal string fields\) where speed matters more than an accuracy loss of up to 5%; mandate Sonnet when the schema requires multi-hop reasoning to fill values \(e.g., 'calculate tax from subtotal and rate'\).
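A minimal sketch of this routing rule in Python. It assumes the schema author marks computed fields by hand with a hypothetical `derived` flag and nests objects under a hypothetical `fields` key; neither convention comes from the report or from any provider API, and the model labels are placeholders.

```python
from typing import Any

# Placeholder labels; substitute the model identifiers your provider uses.
HAIKU = "haiku"
SONNET = "sonnet"


def needs_reasoning(schema: dict[str, Any]) -> bool:
    """Return True if any field, at any depth, is marked as derived."""
    for spec in schema.values():
        if not isinstance(spec, dict):
            continue
        # "derived" is this sketch's own flag for values that must be computed.
        if spec.get("derived", False):
            return True
        # "fields" is this sketch's own convention for nested objects.
        if needs_reasoning(spec.get("fields", {})):
            return True
    return False


def pick_model(schema: dict[str, Any]) -> str:
    """Route schemas with derived values to Sonnet, everything else to Haiku."""
    return SONNET if needs_reasoning(schema) else HAIKU


# Flat schema: literal string fields only.
flat_schema = {
    "vendor": {"type": "string"},
    "invoice_id": {"type": "string"},
}

# Nested schema where "tax" must be calculated from subtotal and rate.
invoice_schema = {
    "vendor": {"type": "string"},
    "totals": {
        "type": "object",
        "fields": {
            "subtotal": {"type": "number"},
            "tax": {"type": "number", "derived": True},
        },
    },
}

print(pick_model(flat_schema))
print(pick_model(invoice_schema))
```

The sketch routes on the derived flag alone. How to treat nested schemas whose values are all literal is a separate decision that this sketch leaves open.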
Journey Context:
Teams default to Sonnet for all extraction, paying 12x more on high-volume parsing \($3.00 vs $0.25 per 1M input tokens\). Haiku's failure mode is systematic rather than random noise: it hallucinates keys in nested objects or swaps types \(string vs number\) when a value requires arithmetic. On a 100-sample benchmark, Haiku matches Sonnet on 98% of flat-schema samples and 72% of nested reasoning-schema samples. The 28-point shortfall on nested schemas is unacceptable for financial data but does not show up in simple entity tagging. The quality degradation signature is type errors in nested JSON, not gibberish text.
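Because the degradation signature is structural, it can be detected without a second model call. The following is a sketch of such a check, assuming the expected shape is written as a plain Python dict of types; `EXPECTED`, `type_matches`, `find_violations`, and the sample payload are all illustrative and not taken from the benchmark.

```python
import json
from typing import Any

# Illustrative expected shape: leaves are Python types, nested objects are dicts.
EXPECTED: dict[str, Any] = {
    "vendor": str,
    "totals": {"subtotal": float, "tax": float},
}


def type_matches(value: Any, expected_type: type) -> bool:
    """Check a leaf value, treating JSON integers as valid numbers."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so handle it explicitly.
        return expected_type is bool
    if expected_type is float:
        return isinstance(value, (int, float))
    return isinstance(value, expected_type)


def find_violations(
    output: Any, expected: Any, path: tuple[str, ...] = ()
) -> list[str]:
    """Collect unexpected keys, missing keys, and type swaps at every depth."""
    where = ".".join(path) or "<root>"
    if not isinstance(expected, dict):
        if type_matches(output, expected):
            return []
        return [f"{where}: expected {expected.__name__}, got {type(output).__name__}"]
    if not isinstance(output, dict):
        return [f"{where}: expected object, got {type(output).__name__}"]

    problems: list[str] = []
    for key in output:
        if key not in expected:
            full = ".".join(path + (key,))
            problems.append(f"{full}: unexpected key")
    for key, sub in expected.items():
        if key not in output:
            full = ".".join(path + (key,))
            problems.append(f"{full}: missing key")
        else:
            problems.extend(find_violations(output[key], sub, path + (key,)))
    return problems


# Made-up model output showing both failure types: an extra nested key
# and a number returned as a string.
sample = json.loads(
    '{"vendor": "Acme", "totals": {"subtotal": 100.0, "tax": "8.25", "tax_rate": 0.0825}}'
)
violations = find_violations(sample, EXPECTED)
for line in violations:
    print(line)
```

A non-empty result could serve as the trigger to retry the record on Sonnet. Note that this catches only shape and type errors, not a correctly typed but wrong arithmetic result.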
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:53:48.151282+00:00— report_created — created