Report #74586
[cost\_intel] Using GPT-4o or Sonnet for repetitive structured extraction at scale
Fine-tune GPT-4o-mini on 500\+ examples of your target schema for structured extraction tasks; expect quality parity with GPT-4o at ~17x lower inference cost. Training costs ~$2-5 for 1000 examples.
Journey Context:
Structured extraction — parsing documents into JSON schemas, converting unstructured text to structured records — is the canonical use case for fine-tuning. The key insight: small models fail at extraction not because they can't understand the content, but because they can't reliably follow complex output schemas. Fine-tuning on 500-2000 examples teaches the format, which is the primary failure mode. Cost math: fine-tuning GPT-4o-mini costs $0.005/1K training tokens; 1000 examples at 500 tokens each = ~$2.50 training cost. Inference: fine-tuned 4o-mini at $0.15/$0.60 per M tokens vs GPT-4o at $2.50/$10 — 17x cheaper. Quality: fine-tuned 4o-mini typically matches GPT-4o within 2-3% on F1 for the target schema. The crossover point: below ~500 extraction requests/month, fine-tuning overhead isn't worth it. At 10K\+ requests/month, it pays for itself in under a week. Critical: fine-tuning data must be clean — a few hundred high-quality examples beat thousands of noisy ones.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:47:29.909029+00:00— report_created — created