Report #62006
[cost\_intel] Fine-tuning projects underestimating costs by 10-50x by accounting only for inference tokens, ignoring training token charges and epoch multipliers
Calculate TCO: \(training tokens × epochs × training rate\) \+ \(inference tokens × inference rate\). For datasets >100k examples, prefer few-shot prompting with retrieval over fine-tuning. If fine-tuning is required, limit training to 1-2 epochs \(often sufficient with large foundation models\) and use validation-based early stopping to avoid paying for epochs that no longer improve validation loss.
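The TCO formula above can be sketched as a small helper. This is a minimal illustration, not a billing tool: the function name `estimate_tco`, the `TcoBreakdown` type, and the per-1k-token rate convention are assumptions made for this example, and the sample figures are the example rates quoted in this report rather than current prices.

```python
from dataclasses import dataclass


@dataclass
class TcoBreakdown:
    """Cost split in dollars (hypothetical helper type for this sketch)."""
    training_cost: float
    inference_cost: float

    @property
    def total(self) -> float:
        return self.training_cost + self.inference_cost


def estimate_tco(
    training_tokens: int,
    epochs: int,
    training_rate_per_1k: float,
    inference_tokens: int,
    inference_rate_per_1k: float,
) -> TcoBreakdown:
    """(training tokens x epochs x training rate) + (inference tokens x inference rate).

    Rates are assumed to be quoted in dollars per 1k tokens.
    """
    if epochs < 1:
        raise ValueError("epochs must be at least 1")
    # Every epoch re-bills the full dataset, so epochs multiply the training cost.
    training_cost = training_tokens / 1000 * epochs * training_rate_per_1k
    inference_cost = inference_tokens / 1000 * inference_rate_per_1k
    return TcoBreakdown(training_cost, inference_cost)


# Example figures from this report: 1M-token dataset, 4 epochs, $0.008/1k training,
# plus an assumed 5M inference tokens at $0.0015/1k.
costs = estimate_tco(1_000_000, 4, 0.008, 5_000_000, 0.0015)
print(f"training=${costs.training_cost:.2f} "
      f"inference=${costs.inference_cost:.2f} total=${costs.total:.2f}")
```

Rerunning the same call with `epochs=2` shows how much of the bill comes from the epoch multiplier alone.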
Journey Context:
Fine-tuning pricing has two parts: training costs \(e.g., OpenAI $0.008/1k tokens for GPT-3.5\) and inference costs \($0.0015/1k tokens\). Training runs for multiple epochs \(default 3-4\), so you pay for 3-4x the dataset size in training tokens. A 1M-token dataset over 4 epochs costs $32 to train \(1,000 × 4 × $0.008\), then $1.50 per 1M inference tokens. Many teams budget only for inference, or assume 'train once, infer many' without realizing that, at these rates, the training cost exceeds cumulative inference spend until roughly 21M inference tokens \($32 ÷ $1.50 per 1M\). Furthermore, overfitting is less common with modern instruction-tuned base models, and 1-2 epochs often suffice, but libraries default to 3-4. The trap is treating fine-tuning as a 'free' optimization when it is often cheaper to use a larger base model with RAG than to fine-tune a small one.
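The break-even point implied by the example rates above can be checked with a few lines of arithmetic. The sketch below is illustrative only: `breakeven_inference_tokens` is a hypothetical name, rates are assumed to be in dollars per 1k tokens, and the inputs are the example figures from this report, not current pricing.

```python
def breakeven_inference_tokens(
    dataset_tokens: int,
    epochs: int,
    training_rate_per_1k: float,
    inference_rate_per_1k: float,
) -> float:
    """Inference volume at which cumulative inference spend equals the training bill."""
    if inference_rate_per_1k <= 0:
        raise ValueError("inference rate must be positive")
    # One-off training bill: the dataset is charged once per epoch.
    training_cost = dataset_tokens / 1000 * epochs * training_rate_per_1k
    # Tokens needed for inference spend to reach that amount.
    return training_cost / inference_rate_per_1k * 1000


# 1M-token dataset, 4 epochs, $0.008/1k training, $0.0015/1k inference.
tokens = breakeven_inference_tokens(1_000_000, 4, 0.008, 0.0015)
print(f"training dominates until about {tokens / 1e6:.1f}M inference tokens")
```

With these inputs the $32 training bill is matched only after about 21.3M inference tokens. Dropping to 2 epochs halves that threshold.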
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T10:33:58.548122+00:00— report_created — created