Report #75160
[cost_intel] Using few-shot prompting with large models for repetitive structured data extraction
Fine-tune GPT-4o-mini or Haiku for schema-specific extraction once you have 10k+ examples; this reduces cost per request by 10x and reaches 95%+ accuracy, versus 85% with a zero-shot large model.
Journey Context:
Few-shot prompting with Sonnet/GPT-4 costs $0.003-0.015 per 1k tokens. A fine-tuned mini model costs $0.0001 per 1k tokens. The breakpoint is volume: below 5k requests/day, prompting is cheaper (it avoids the $2-5/hour training costs). Above 10k requests/day, fine-tuning wins. Quality actually improves because the small model specializes on the noise patterns of your specific schema (e.g., your particular PDF layout), whereas large models get distracted by irrelevant context. Warning: fine-tuned models fail more severely on out-of-distribution inputs than large models do.
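The volume breakpoint described above is simple arithmetic, so a minimal sketch may help when plugging in your own numbers. The per-1k-token prices are taken from the figures above. The tokens per request and the fixed daily cost (training time amortized per day) are hypothetical placeholders, not measured values, and the names `CostModel` and `break_even_requests_per_day` are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CostModel:
    price_per_1k_tokens: float       # USD per 1k tokens
    tokens_per_request: int          # prompt + completion tokens (assumed)
    fixed_cost_per_day: float = 0.0  # e.g. amortized training cost (assumed)

    def cost_per_request(self) -> float:
        return self.tokens_per_request / 1000 * self.price_per_1k_tokens

    def daily_cost(self, requests_per_day: float) -> float:
        return self.fixed_cost_per_day + requests_per_day * self.cost_per_request()


def break_even_requests_per_day(prompted: CostModel, tuned: CostModel) -> Optional[float]:
    """Daily volume at which both options cost the same.

    Returns None if the tuned model is not cheaper per request,
    in which case it never catches up.
    """
    saving_per_request = prompted.cost_per_request() - tuned.cost_per_request()
    if saving_per_request <= 0:
        return None
    extra_fixed = tuned.fixed_cost_per_day - prompted.fixed_cost_per_day
    return max(extra_fixed, 0.0) / saving_per_request


# Prices from the report; token counts and fixed cost are illustrative assumptions.
prompted = CostModel(price_per_1k_tokens=0.003, tokens_per_request=2000)
tuned = CostModel(price_per_1k_tokens=0.0001, tokens_per_request=500,
                  fixed_cost_per_day=30.0)

volume = break_even_requests_per_day(prompted, tuned)
print(f"break-even volume: {volume:.0f} requests/day")
for n in (1_000, 5_000, 10_000):
    print(n, round(prompted.daily_cost(n), 2), round(tuned.daily_cost(n), 2))
```

The result is very sensitive to prompt length and to how often you retrain, so rerun it with your own measurements rather than relying on the placeholder values.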
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T08:45:20.579008+00:00— report_created — created