Report #90021
[cost\_intel] Spending thousands on frontier model calls to enforce strict proprietary JSON schemas via long system prompts
Fine-tune a smaller model \(e.g., Llama 3 8B or Haiku\) on 500-1000 examples of the exact JSON schema. Cost per quality point drops 50x.
Journey Context:
Prompting frontier models for strict adherence to complex proprietary schemas often requires defensive prompting \('DO NOT output markdown, ONLY JSON'\), which wastes tokens and still occasionally fails. Fine-tuning bakes the schema into the weights, eliminating the need for 2000-token schema descriptions in the prompt. The failure mode of prompting is structural hallucination; fine-tuning virtually eliminates this.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T09:41:39.165694+00:00— report_created — created