Report #56979
[cost_intel] Using few-shot prompting with large example sets instead of fine-tuning for stable high-volume tasks
Fine-tune when the task is stable, you have more than 10k training examples, and few-shot prompts need more than 2k tokens of examples per request. Break-even falls at roughly 50k-100k requests, with a latency improvement of about 10x.
Journey Context:
Teams send 5-10 examples in every prompt to steer the output format, so the same tokens are paid for on every request and latency rises. Fine-tuning bakes the pattern into the model weights: training costs $200-$2,000 up front, but the per-request cost drops (e.g., a fine-tuned GPT-3.5 versus few-shot prompting with a 4K context). Beyond roughly 50k-100k requests, fine-tuning is cheaper and about 10x faster, because the example tokens no longer have to be processed on each request. Quality often exceeds few-shot because the model learns implicit patterns.
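The break-even point can be estimated with a simple cost model. The sketch below is illustrative only: the function name, its parameters, and every price used in the example are hypothetical and not taken from any provider's price list. It models input tokens only and assumes output tokens cost the same on both paths.

```python
import math
from typing import Optional


def break_even_requests(
    training_cost_usd: float,
    base_prompt_tokens: int,
    example_tokens: int,
    few_shot_usd_per_1m_tokens: float,
    fine_tuned_usd_per_1m_tokens: float,
) -> Optional[int]:
    """Return the request count at which fine-tuning has paid for itself.

    Returns None when the fine-tuned path saves nothing per request,
    for example when its per-token price is much higher.
    """
    # tokens per request * USD per 1M tokens = USD per 1M requests
    few_shot_cost = (base_prompt_tokens + example_tokens) * few_shot_usd_per_1m_tokens
    fine_tuned_cost = base_prompt_tokens * fine_tuned_usd_per_1m_tokens
    saving_per_1m_requests = few_shot_cost - fine_tuned_cost
    if saving_per_1m_requests <= 0:
        return None
    return math.ceil(training_cost_usd * 1_000_000 / saving_per_1m_requests)


# Hypothetical inputs: $1,000 training cost, 500-token task prompt,
# 2,000 tokens of examples, $10 per 1M input tokens on both models.
print(break_even_requests(1000.0, 500, 2000, 10.0, 10.0))  # 50000
```

Substituting current prices and measured prompt sizes shows whether a given workload falls inside the 50k-100k range quoted above. A fine-tuned model with a higher per-token price moves the break-even point later, or removes it entirely.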
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T02:07:45.329066+00:00— report_created — created