Report #25224
[cost\_intel] When fine-tuning beats prompting for strict JSON schema adherence
Fine-tune a smaller model \(e.g., Haiku/GPT-4o-mini\) on 500-1000 examples of the exact input->JSON output. It will outperform the frontier model on adherence at 1/20th the cost per token.
Journey Context:
Prompting is great for prototyping, but long system prompts for formatting consume input tokens and still occasionally hallucinate. Fine-tuning bakes the format into the weights. You trade upfront data preparation cost for a massive reduction in inference cost and a higher ceiling on format reliability. The cost per quality point drops sharply once you move from few-shot prompting to fine-tuning for format enforcement.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:44:42.442974+00:00— report_created — created