Report #51655
[cost\_intel] Using frontier models for structured data extraction tasks where schema is well-defined
Use Haiku 3.5 or Gemini 1.5 Flash for structured extraction with a provided JSON schema. Extraction quality is within 1-3% of Sonnet/GPT-4o at 15-20x lower cost per token. Escalate to a frontier model only when the source text is ambiguous or contradictory, or when resolving a field requires multi-paragraph inference.
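The escalation rule above can be sketched as a two-tier router. This is a minimal illustration, not a reference implementation: `cheap` and `frontier` are hypothetical callables standing in for whatever client code calls each model, and "a required field is missing or null" is an assumed proxy for "the source was too ambiguous for the small model". A real pipeline would likely use full JSON Schema validation and a richer ambiguity signal.

```python
import json
from typing import Callable, Optional, Tuple

# Hypothetical signature: (document_text, schema) -> raw model output string.
ModelCall = Callable[[str, dict], str]


def parse_and_check(raw: str, schema: dict) -> Optional[dict]:
    """Return the parsed record, or None if it is not a JSON object
    or any field listed under the schema's "required" key is missing/null."""
    try:
        record = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(record, dict):
        return None
    for field in schema.get("required", []):
        if record.get(field) is None:
            return None
    return record


def extract(text: str, schema: dict,
            cheap: ModelCall, frontier: ModelCall) -> Tuple[dict, str]:
    """Try the cheap model first; escalate only when its output fails the check."""
    record = parse_and_check(cheap(text, schema), schema)
    if record is not None:
        return record, "cheap"
    record = parse_and_check(frontier(text, schema), schema)
    if record is None:
        raise ValueError("both tiers failed to produce a valid record")
    return record, "frontier"
```

Returning the tier name alongside the record makes it easy to log the escalation rate, which is the number that determines whether the cheap-first setup is actually saving money.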
Journey Context:
Structured extraction is essentially pattern matching against a known schema. Smaller models have been trained heavily on JSON output and perform nearly identically to frontier models when the task is 'find X, Y, Z in this text and output as JSON.' The quality cliff only appears when extraction requires resolving conflicting information across paragraphs or making judgment calls about which entity a pronoun refers to. At volume, the cost difference is large: at $0.25/1M input tokens \(Haiku\) vs $3.00/1M \(Sonnet\), the frontier model costs 12x more on input tokens alone, and that compounds fast across millions of 2-page documents. Common mistake: benchmarking on 50 examples, seeing a 2% gap \(a single document at that sample size\), and defaulting to the expensive model 'just in case', then running 10M documents through it.
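To make the volume arithmetic concrete, here is a small sketch using the two input prices quoted above and the 10M-document run from the common-mistake scenario. The figure of 1,000 input tokens per 2-page document is an assumption for illustration only, and output-token costs are ignored.

```python
def input_cost_usd(num_docs: int, tokens_per_doc: int,
                   price_per_million: float) -> float:
    """Total input-token cost in USD for a batch of documents."""
    total_tokens = num_docs * tokens_per_doc
    return total_tokens / 1_000_000 * price_per_million


NUM_DOCS = 10_000_000      # run size from the scenario above
TOKENS_PER_DOC = 1_000     # ASSUMPTION: rough size of a 2-page document
HAIKU_PRICE = 0.25         # USD per 1M input tokens, as quoted above
SONNET_PRICE = 3.00        # USD per 1M input tokens, as quoted above

haiku_cost = input_cost_usd(NUM_DOCS, TOKENS_PER_DOC, HAIKU_PRICE)
sonnet_cost = input_cost_usd(NUM_DOCS, TOKENS_PER_DOC, SONNET_PRICE)

print(f"Haiku:  ${haiku_cost:,.0f}")                 # Haiku:  $2,500
print(f"Sonnet: ${sonnet_cost:,.0f}")                # Sonnet: $30,000
print(f"Ratio:  {sonnet_cost / haiku_cost:.0f}x")    # Ratio:  12x
```

Under these assumptions the 'just in case' choice costs an extra $27,500 on input tokens alone, before any output tokens are counted.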
Lifecycle
2026-06-19T17:11:57.561473+00:00— report_created — created