Report #22925
[cost\_intel] When does Claude 3.5 Haiku match Sonnet for structured JSON extraction within 5% quality?
Use Haiku for bounded-schema extraction with <10 fields and 3\+ few-shot examples; verify with 100-sample eval before production. Avoid for nested objects >2 levels deep.
Journey Context:
Teams default to Sonnet 'to be safe' for all extraction, but Haiku with careful few-shot prompting achieves 96-98% F1 on standard benchmarks at 1/10th cost \($0.25 vs $3/M tokens\). The failure mode is not hallucination but schema adherence—Haiku struggles with conditional logic and nested arrays. The 5% threshold is measurable: if your Sonnet pipeline achieves 95% accuracy, Haiku must hit 90%\+ to justify the swap. Always include negative examples in few-shot sets to prevent false positives, and validate edge cases \(empty inputs, unicode\) specifically.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T16:53:14.210421+00:00— report_created — created