Report #92164
[counterintuitive] fine-tuning beats prompting for custom behavior
Exhaust prompt engineering (including few-shot and RAG) before fine-tuning; use fine-tuning primarily for style/format adherence, latency reduction, or cost savings, not for injecting new factual knowledge.
Journey Context:
Developers often fine-tune to teach new facts or complex behaviors, assuming it 'bakes in' the knowledge. Fine-tuning adjusts weights to minimize loss on the training data, which tends to produce memorization of the training examples rather than knowledge the model can reliably generalize from. It is typically much less reliable at teaching new factual knowledge than RAG, which explicitly provides the facts at inference time. Fine-tuning is highly effective for shaping how the model speaks (tone, consistently outputting a specific JSON schema), but prompting/RAG remains superior for what the model knows.
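To illustrate what "explicitly provides the facts at inference time" can look like, here is a minimal sketch of the RAG pattern in plain Python. Everything in it is an assumption made for illustration: the `DOCS` knowledge base, the `retrieve` and `build_prompt` helpers, and the word-overlap scoring, which stands in for the embedding search a real system would more likely use. No model is called; the sketch only shows how retrieved facts end up in the prompt.

```python
import re

# Hypothetical in-memory knowledge base (illustrative data only).
# A real deployment would more likely use a vector store.
DOCS = [
    "The Acme X200 router supports firmware updates over USB only.",
    "Refunds are processed within 14 days of the return being received.",
    "The Acme X200 router ships with a 2-year limited warranty.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, docs, k=2):
    """Return the k docs sharing the most words with the question.

    Word overlap is a crude stand-in for embedding similarity.
    """
    q_tokens = tokenize(question)
    ranked = sorted(docs, key=lambda d: len(q_tokens & tokenize(d)), reverse=True)
    return ranked[:k]


def build_prompt(question, docs, k=2):
    """Assemble a prompt that carries the retrieved facts alongside the question."""
    facts = retrieve(question, docs, k)
    context = "\n".join(f"- {fact}" for fact in facts)
    return (
        "Answer using only the facts below. If they are insufficient, say so.\n"
        f"Facts:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


if __name__ == "__main__":
    print(build_prompt("How long is the warranty on the X200 router?", DOCS))
```

Updating what the model "knows" in this setup means editing `DOCS`, with no retraining step, which is the practical reason the report favors RAG over fine-tuning for factual knowledge.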
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:17:22.915053+00:00— report_created — created