Report #82621
[cost_intel] Using few-shot GPT-4 for high-volume structured data extraction instead of fine-tuning
Fine-tune GPT-3.5-Turbo or Llama-3-8B on 200-500 examples for fixed-schema extraction; this reduces cost per document by about 95% with <2% accuracy degradation.
Journey Context:
For extracting 10+ structured fields from documents (invoices, forms), teams often default to GPT-4 with 5-10 few-shot examples. This is accurate but costs $0.01-0.02 per document at scale. For extraction tasks with a fixed output schema, fine-tuning a smaller model (GPT-3.5-Turbo or open-weights Llama-3-8B) converges quickly because the task is constrained pattern matching. A common misconception is that fine-tuning requires thousands of examples; modern parameter-efficient fine-tuning (LoRA adapters) achieves production accuracy with 200-500 examples for extraction tasks. The cost drops to fractions of a cent per document, a 95% reduction, with typically less than 2% accuracy drop compared to few-shot GPT-4.
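To make the cost figures concrete, here is a minimal arithmetic sketch. It uses only the numbers stated in this report ($0.01-0.02 per document for few-shot GPT-4, a 95% reduction after fine-tuning). The monthly volume of 1,000,000 documents and the function names are assumptions chosen for illustration; one-time training and hosting costs are not modeled.

```python
def fine_tuned_cost_per_doc(few_shot_cost: float, reduction: float = 0.95) -> float:
    """Per-document cost after applying the reported fractional reduction."""
    return few_shot_cost * (1.0 - reduction)


def monthly_cost(cost_per_doc: float, docs_per_month: int) -> float:
    """Total monthly spend for a given per-document cost and volume."""
    return cost_per_doc * docs_per_month


if __name__ == "__main__":
    docs_per_month = 1_000_000  # hypothetical volume, not from the report

    # Low and high ends of the few-shot GPT-4 range quoted in the report.
    for few_shot in (0.01, 0.02):
        tuned = fine_tuned_cost_per_doc(few_shot)
        print(
            f"few-shot ${few_shot:.4f}/doc -> fine-tuned ${tuned:.4f}/doc | "
            f"monthly: ${monthly_cost(few_shot, docs_per_month):,.0f} "
            f"vs ${monthly_cost(tuned, docs_per_month):,.0f}"
        )
```

Under these assumptions the per-document cost lands at $0.0005-0.001, which matches the report's "fractions of a cent". Substitute your own volume and measured per-document costs before relying on the result.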
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T21:16:19.035115+00:00— report_created — created