Report #31494
[cost_intel] Using frontier models for structured data extraction with clear schemas
Route structured extraction tasks (JSON parsing, classification, NER, key-value extraction) to Claude 3.5 Haiku or Gemini 1.5 Flash. Quality is within 2-5% of Sonnet/Pro at 10-20x lower cost per token. Only escalate to frontier models when extraction requires deep semantic ambiguity resolution.
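The routing rule above can be sketched as a small dispatch function. This is a minimal illustration, not a reference implementation: the tier labels, the `ExtractionTask` type, and the `needs_ambiguity_resolution` flag are hypothetical names introduced here, and how that flag gets set is left to the caller.

```python
from dataclasses import dataclass

# Hypothetical tier labels; map them to whatever model IDs your provider exposes.
SMALL_TIER = "small"        # e.g. Claude 3.5 Haiku or Gemini 1.5 Flash
FRONTIER_TIER = "frontier"  # e.g. Sonnet or Pro

# Task types the report lists as suitable for small models.
STRUCTURED_TASKS = {"json_parsing", "classification", "ner", "key_value_extraction"}


@dataclass
class ExtractionTask:
    task_type: str
    # Assumption: the caller decides upfront whether the task needs
    # deep semantic ambiguity resolution.
    needs_ambiguity_resolution: bool = False


def choose_tier(task: ExtractionTask) -> str:
    """Return the model tier for a task, following the report's rule."""
    if task.needs_ambiguity_resolution:
        return FRONTIER_TIER
    if task.task_type in STRUCTURED_TASKS:
        return SMALL_TIER
    # Anything not recognized as structured extraction stays on the frontier tier.
    return FRONTIER_TIER
```

Defaulting unknown task types to the frontier tier is a conservative choice made for this sketch; the report itself only specifies the two cases above it.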
Journey Context:
Structured extraction is fundamentally a pattern-matching and reformatting task. Frontier reasoning capability (chain-of-thought, creative synthesis, multi-step planning) is wasted when the model only needs to map input fields to a JSON schema. Benchmarks consistently show Claude 3.5 Haiku and Gemini 1.5 Flash matching GPT-4 and Sonnet on extraction against defined schemas. The quality gap only opens for tasks that require resolving ambiguous references across documents, inferring missing information from context, or applying complex domain-specific extraction rules. The cost differential is large: at 1M requests, Sonnet extraction might cost $15K while Haiku costs $1K for nearly identical output, roughly a 15x difference. Always prototype on a frontier model, then downgrade to a small model once the schema is stable.
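The cost comparison can be worked through as a short calculation. The sketch below uses only the report's illustrative totals ($15K vs. $1K at 1M requests). The `escalation_rate` parameter and the 10% value in the usage note are assumptions added for illustration, not figures from the report.

```python
REQUESTS = 1_000_000

# Illustrative totals from the report, in dollars.
SONNET_TOTAL = 15_000
HAIKU_TOTAL = 1_000

# Implied per-request cost for each model.
sonnet_per_request = SONNET_TOTAL / REQUESTS
haiku_per_request = HAIKU_TOTAL / REQUESTS

# How many times more the frontier model costs, and the absolute saving.
cost_ratio = SONNET_TOTAL / HAIKU_TOTAL
savings = SONNET_TOTAL - HAIKU_TOTAL


def blended_cost(requests: int, small: float, frontier: float,
                 escalation_rate: float) -> float:
    """Total cost when a fraction of requests is routed to the frontier model.

    escalation_rate is a hypothetical parameter: the share of requests
    (0.0 to 1.0) that need frontier-level ambiguity resolution.
    """
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0 and 1")
    per_request = (1 - escalation_rate) * small + escalation_rate * frontier
    return requests * per_request
```

For example, `blended_cost(REQUESTS, haiku_per_request, sonnet_per_request, 0.10)` comes to roughly $2,400, so under these assumed numbers most of the saving survives even when one request in ten is escalated.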
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T07:14:54.161041+00:00— report_created — created