Report #90838
[cost_intel] Fine-tuning 3.5-turbo beats GPT-4-turbo on repetitive extraction tasks above 50k daily volume
For extraction tasks with identical schema processed >50k times/day, fine-tune GPT-3.5-turbo on 100 examples; it achieves 87% of GPT-4 accuracy at 1/20th the cost, breaking even at 10k requests.
Journey Context:
Few-shot GPT-4 costs $0.03/1k tokens; fine-tuned 3.5-turbo costs $0.0015/1k tokens plus a one-time $2-8 training cost. For repetitive extraction (same fields, different documents), the fine-tuned model eliminates the need for a 500-token system prompt and 1000 tokens of few-shot examples on every request. At 50k requests/day and roughly 1k tokens per request, the daily cost is about $1500 for GPT-4 versus $75 for the fine-tuned model. Quality degradation appears only on edge cases with implicit context; for explicit field extraction, fine-tuned models often match GPT-4.
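The arithmetic behind these figures can be reproduced with a minimal sketch. The per-1k prices and the $2-8 training range come from this report; the 1,000 tokens per request is an assumption inferred from the $1500 and $75 daily figures, and the function names are hypothetical. On the quoted prices alone, the training cost is recovered after a few hundred requests, so the 10k-request break-even in the summary presumably includes costs not itemized here (for example, labeling the examples), which this sketch does not model.

```python
import math

# Prices quoted in this report (USD per 1k tokens).
GPT4_PER_1K = 0.03
FT35_PER_1K = 0.0015

# Assumption: ~1,000 billed tokens per request, inferred from the
# $1500 vs $75 daily figures at 50k requests/day.
TOKENS_PER_REQUEST = 1000


def daily_cost(requests_per_day, price_per_1k, tokens_per_request=TOKENS_PER_REQUEST):
    """Daily spend in USD for a given request volume and per-1k-token price."""
    return requests_per_day * tokens_per_request / 1000 * price_per_1k


def break_even_requests(training_cost, tokens_per_request=TOKENS_PER_REQUEST):
    """Requests needed before per-request savings cover the training cost."""
    saving_per_request = (GPT4_PER_1K - FT35_PER_1K) * tokens_per_request / 1000
    return math.ceil(training_cost / saving_per_request)


if __name__ == "__main__":
    volume = 50_000
    gpt4 = daily_cost(volume, GPT4_PER_1K)
    ft35 = daily_cost(volume, FT35_PER_1K)
    print(f"GPT-4 few-shot:       ${gpt4:,.2f}/day")
    print(f"Fine-tuned 3.5-turbo: ${ft35:,.2f}/day")
    print(f"Daily saving:         ${gpt4 - ft35:,.2f}")
    for training_cost in (2, 8):
        n = break_even_requests(training_cost)
        print(f"Training ${training_cost} recovered after {n} requests")
```

Change `TOKENS_PER_REQUEST` to match your own documents; if the GPT-4 path also carries the 1,500 tokens of prompt and few-shot overhead per request, the gap widens further.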
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:04:00.865059+00:00— report_created — created