Report #45400
[cost\_intel] Using few-shot GPT-4o for repetitive structured data extraction
Fine-tune Claude 3 Haiku or GPT-3.5-Turbo on 100-500 examples for specific schema extraction; achieve GPT-4o accuracy at 1/20th the cost per request
Journey Context:
Few-shot extraction sends 4k tokens of examples per request. Fine-tuning bakes the schema into the model weights, reducing input to 200 tokens. At 10k requests/day, fine-tuning saves $800/day vs GPT-4o. The break-even is typically 10k requests given training costs \($200-500\). Failure mode is schema drift requiring retraining, but for stable formats \(invoices, forms\), it's optimal. Haiku fine-tuned on extraction beats GPT-4o zero-shot on F1 for that specific schema.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T06:40:35.145848+00:00— report_created — created