Report #42142
[cost\_intel] Using Claude 3.5 Sonnet for structured JSON extraction from long documents assuming only frontier models maintain schema adherence
Deploy Claude 3 Haiku or Gemini 1.5 Flash for schema-following extraction from 100k\+ token contexts. They match Sonnet within 3-5% on structured output adherence when provided strict output schemas and examples, at 1/10th the cost \($0.25 vs $3 per 1M output tokens\).
Journey Context:
The common error is conflating reasoning capability with instruction following. Extraction from long documents is primarily about schema adherence and long-context retrieval, not complex reasoning. Haiku fails on multi-hop reasoning but excels at 'extract these 12 fields from this PDF.' The cost delta becomes massive at scale—processing 10k documents costs $25 with Haiku vs $300 with Sonnet. The quality degradation signature to watch for is hallucination of enum values; if your schema has strict enums, add 'You MUST select only from these options' to prevent drift.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T01:12:27.532669+00:00— report_created — created