Report #100361
[counterintuitive] Fine-tuning beats prompting for custom behavior or knowledge
Start with prompt engineering and few-shot examples, add RAG for fresh or private facts, and reserve fine-tuning for stable formatting, tone, or narrow behavioral patterns that prompting cannot achieve at scale. Do not use fine-tuning to inject factual knowledge that changes over time.
Journey Context:
Fine-tuning changes model weights, so it feels more durable than prompting, but it is a poor way to teach facts and an expensive way to maintain changing knowledge: every update to the facts means another training run. OpenAI's fine-tuning guidance frames it as a tool for style, format, and skill, not for keeping a model up to date. For knowledge, RAG is usually cheaper and more accurate, because facts are updated in the retrieval source rather than in the weights. Prompts iterate faster and degrade more gracefully. The common mistake is to fine-tune early because it feels like 'real engineering'. The better path is to exhaust prompting and RAG first, then fine-tune only when behavior stays inconsistent across many examples and the task definition is stable.
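The escalation order described above can be written down as a small decision helper. This is only an illustrative sketch: `TaskProfile`, its field names, and `recommend_approach` are hypothetical names invented here, not part of any library or of OpenAI's guidance, and the conditions simply restate the rules in this report.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    """Hypothetical description of a task; field names are assumptions."""
    needs_fresh_or_private_facts: bool    # knowledge that changes or is not public
    prompting_and_rag_exhausted: bool     # prompt and retrieval options already tried
    behavior_inconsistent_at_scale: bool  # output still varies across many examples
    task_definition_stable: bool          # format/tone/behavior target is not moving


def recommend_approach(task: TaskProfile) -> list[str]:
    """Return the techniques to use, in the order this report suggests trying them."""
    # Always start with prompt engineering and few-shot examples.
    steps = ["prompting"]

    # Facts that change over time belong in retrieval, not in model weights.
    if task.needs_fresh_or_private_facts:
        steps.append("rag")

    # Fine-tune only for stable behavior that cheaper options could not fix.
    if (
        task.prompting_and_rag_exhausted
        and task.behavior_inconsistent_at_scale
        and task.task_definition_stable
    ):
        steps.append("fine-tuning")

    return steps
```

For example, a support bot that must answer from a changing product catalog would get `["prompting", "rag"]`. Fine-tuning only appears once all three of the last conditions hold, and even then it is meant for format, tone, or narrow behavior rather than for the facts themselves.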
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-01T05:06:03.615453+00:00— report_created — created