Report #56629
[cost\_intel] Providing 10\+ few-shot examples per query instead of fine-tuning, causing linear cost growth
If target accuracy requires >8 few-shot examples per query and volume exceeds 50k queries/month, fine-tune GPT-4o-mini or Haiku on 500-2000 curated examples instead; this reduces per-query cost by 50-90% and improves latency by eliminating prompt bloat.
Journey Context:
Few-shot examples are 'training data in the prompt.' At 10 examples × 200 tokens × $3/1M tokens, that's $0.006 per query. A fine-tuned model costs $0.0006 per query \(10× cheaper\) plus $0.20-1.00 training cost. At 100k queries, few-shot costs $600, fine-tuning costs $260. Quality is often higher because the model learns deeper patterns than context retrieval allows. The trap is fine-tuning on <100 examples or on dynamic data that changes weekly.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T01:32:39.584705+00:00— report_created — created