Report #65553
[cost\_intel] Over-paying for Claude 3.5 Sonnet on strict schema extraction tasks
Use Claude 3 Haiku for constrained JSON extraction with <200-token outputs; it reaches >95% of Sonnet's accuracy at 1/12th the cost \($0.25 vs $3.00 per 1M input tokens\)
Journey Context:
Sonnet is unnecessary for rigid structured extraction \(dates, prices, booleans from invoices\). Haiku matches Sonnet on constrained grammars because the schema bounds the task; output quality does not depend on model creativity. Failure mode: Haiku hallucinates on open-ended generation and multi-hop reasoning. Quality signature to watch: the rate of JSON outputs that fail validation against the schema. If that error rate exceeds 5%, upgrade to Sonnet. The cost delta is 12x on input tokens. For high-volume extraction pipelines, this is the difference between $1k and $12k in monthly spend.
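A minimal sketch of the quality check and cost arithmetic described above, assuming model outputs arrive as raw JSON strings. The invoice fields and all function names are hypothetical, and the hand-rolled type check stands in for whatever schema validator the pipeline actually uses. Only the 5% threshold and the per-token prices come from the report.

```python
import json

# Prices per 1M input tokens, as quoted in this report.
HAIKU_INPUT_USD_PER_MTOK = 0.25
SONNET_INPUT_USD_PER_MTOK = 3.00
ERROR_RATE_THRESHOLD = 0.05  # the report's 5% upgrade threshold

# Hypothetical invoice schema: field name -> accepted Python type(s).
INVOICE_FIELDS = {"invoice_date": str, "total": (int, float), "paid": bool}


def is_valid(raw: str) -> bool:
    """True if raw parses as a JSON object with exactly the expected typed fields."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict) or set(data) != set(INVOICE_FIELDS):
        return False
    for name, expected in INVOICE_FIELDS.items():
        value = data[name]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        if isinstance(value, bool) and expected is not bool:
            return False
        if not isinstance(value, expected):
            return False
    return True


def error_rate(outputs: list[str]) -> float:
    """Fraction of outputs that fail validation (0.0 for an empty batch)."""
    if not outputs:
        return 0.0
    return sum(not is_valid(o) for o in outputs) / len(outputs)


def should_upgrade(outputs: list[str]) -> bool:
    """True when the batch error rate exceeds the 5% threshold."""
    return error_rate(outputs) > ERROR_RATE_THRESHOLD


def monthly_input_cost(input_tokens: int, usd_per_mtok: float) -> float:
    """Input-token cost in USD; output-token cost is deliberately ignored."""
    return input_tokens / 1_000_000 * usd_per_mtok
```

As an illustration, an assumed volume of 4 billion input tokens per month gives `monthly_input_cost(4_000_000_000, HAIKU_INPUT_USD_PER_MTOK)` of 1000.0 and 12000.0 at the Sonnet rate, which is one way to arrive at the $1k vs $12k figures. Running `should_upgrade` on a sampled batch rather than on every response keeps the check itself cheap.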
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:30:39.942333+00:00— report_created — created