Report #99883
[cost_intel] When does fine-tuning beat few-shot prompting on cost per quality?
Fine-tuning wins when you have a stable, high-volume task with thousands of examples and a narrow output distribution. The break-even is usually in the tens of thousands of requests per month, because the upfront training cost has to be recovered through per-request savings: a fine-tuned model with a short prompt must be cheaper per request than longer few-shot prompts on a larger base model, and by enough to pay back the training run.
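The break-even reasoning can be sketched as a small calculation. This is a minimal illustration, not a pricing model: the function names, token counts, per-million-token prices, and the 500-dollar upfront cost are all hypothetical placeholders, so substitute your provider's actual rates and your own data-preparation and training costs.

```python
import math


def cost_per_request(prompt_tokens, output_tokens,
                     in_price_per_mtok, out_price_per_mtok):
    """Inference cost in dollars for a single request."""
    total = prompt_tokens * in_price_per_mtok + output_tokens * out_price_per_mtok
    return total / 1_000_000


def break_even_requests(upfront_cost, few_shot_cost, fine_tuned_cost):
    """Requests needed before per-request savings repay the upfront cost.

    Returns None when the fine-tuned path is not cheaper per request,
    in which case the upfront cost is never recovered.
    """
    saving = few_shot_cost - fine_tuned_cost
    if saving <= 0:
        return None
    return math.ceil(upfront_cost / saving)


# Hypothetical figures, for illustration only.
# Few-shot: long prompt (3000 tokens) on a larger, pricier base model.
few_shot = cost_per_request(3000, 200, in_price_per_mtok=3.00, out_price_per_mtok=15.00)
# Fine-tuned: short prompt (300 tokens) on a smaller, cheaper model.
fine_tuned = cost_per_request(300, 200, in_price_per_mtok=1.00, out_price_per_mtok=4.00)

# Upfront cost should cover data preparation as well as the training run.
n = break_even_requests(500.0, few_shot, fine_tuned)
print(f"few-shot ${few_shot:.4f}/req, fine-tuned ${fine_tuned:.4f}/req, break-even at {n} requests")
```

Dividing the break-even count by expected monthly volume gives the payback period in months. If that period is longer than the task is likely to stay stable, few-shot prompting remains the cheaper choice.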
Journey Context:
Teams fine-tune too early, expecting magic from small data. Fine-tuning improves consistency and lets you shrink the prompt, but the upfront data and compute cost is real. The right sequence is: prompt engineering, then few-shot examples, then fine-tune only after the prompt is long and the task is economically important. Track total cost including the training run and ongoing inference, not just per-request price.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T05:13:15.788039+00:00— report_created — created