Report #78442
[counterintuitive] Fine-tuning will teach the model to do something the base model couldn't do
Use fine-tuning for style, format, and domain adaptation — not for adding fundamentally new reasoning capabilities. If the base model cannot reliably do X, fine-tuning probably will not enable X. Use RAG, tool use, or architectural changes for new capabilities instead.
Journey Context:
The widespread belief is that fine-tuning works like training from scratch: you show the model examples of a new skill and it learns the skill. This is wrong in an important way. Fine-tuning shifts the model's output distribution toward certain patterns; it does not add fundamental reasoning capabilities that are absent from the base model. Fine-tuning on math problems makes the model more likely to produce math-like outputs in the fine-tuning format, but it does not give the model an ALU. Fine-tuning on code makes the model better at producing code in the fine-tuning style, but it does not teach it new algorithms. Research on minimal fine-tuning (LIMA) found that about 1,000 carefully curated examples could produce a model that converses about as well as models tuned on far more data, because the capabilities already existed in the base model and fine-tuning primarily shaped the output format. Fine-tuning also uses a tiny fraction of the data and compute of pre-training, and parameter-efficient methods such as LoRA update only a small fraction of the weights, so it reshapes the output landscape rather than building new capabilities. The practical implication: if the base model cannot reliably do X, fine-tuning is unlikely to fix it. Reserve fine-tuning for shaping how the model expresses what it can already do.
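One way to act on this is to measure how often the base model already succeeds at X before spending anything on fine-tuning. The sketch below is illustrative only: `generate` stands in for whatever function calls your base model, `stub_model` is a toy replacement so the example runs offline, and the 0.5 threshold in `recommend` is an arbitrary assumption, not a researched cutoff.

```python
from typing import Callable, Sequence, Tuple

# A case pairs a prompt with a checker that judges the model's output.
Case = Tuple[str, Callable[[str], bool]]


def base_pass_rate(
    generate: Callable[[str], str],
    cases: Sequence[Case],
    samples_per_case: int = 5,
) -> float:
    """Return the fraction of sampled base-model outputs that pass their checker."""
    if not cases or samples_per_case < 1:
        raise ValueError("need at least one case and one sample per case")
    passed = 0
    for prompt, is_correct in cases:
        for _ in range(samples_per_case):
            if is_correct(generate(prompt)):
                passed += 1
    return passed / (len(cases) * samples_per_case)


def recommend(pass_rate: float, threshold: float = 0.5) -> str:
    """Illustrative decision rule; the default threshold is an assumption."""
    if pass_rate >= threshold:
        # The capability is present; fine-tuning can shape style and format.
        return "fine-tune"
    # The capability is missing or unreliable; look at RAG, tools, or design changes.
    return "rag-or-tools"


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for a base model: it only knows one answer."""
    return "4" if prompt == "2+2=" else "unsure"


cases = [
    ("2+2=", lambda out: out.strip() == "4"),
    ("17*23=", lambda out: out.strip() == "391"),
]
rate = base_pass_rate(stub_model, cases, samples_per_case=2)
decision = recommend(rate)
```

With a real model, `generate` would typically wrap a sampled completion call, and the cases would come from a held-out set for the target task. A low pass rate here suggests the gap is a capability gap rather than a formatting gap.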
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T14:15:53.268819+00:00— report_created — created