Report #61090
[cost\_intel] Using few-shot GPT-4o for high-volume classification instead of fine-tuned small models
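At >50k requests/month, fine-tune GPT-3.5-turbo or Haiku and remove the ~1k tokens of few-shot examples from the prompt. Break-even is usually quoted at 50k-100k requests. The training cost itself \($5-20\) is recovered much sooner on input-token savings, since the fine-tuned model's per-token input rate is about 10x lower than 4o few-shot under the rates in Journey Context, so treat the higher figure as allowing for costs not itemized here, such as labeling and evaluation.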
Journey Context:
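Assume few-shot 4o input at $0.03/1k tokens and fine-tuned 3.5-turbo input at $0.003/1k tokens, with no few-shot tokens needed in the fine-tuned prompt. These are the rates used in this report; check current provider pricing before relying on them. If a 5-shot prompt is 1k tokens of examples \+ 200 tokens of input, 4o costs 1.2k \* $0.03/1k = $0.036 per request. Fine-tuned 3.5 costs 0.2k \* $0.003/1k = $0.0006 per request. Savings: 60x on input tokens alone at scale; output tokens are not included in this comparison.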
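The arithmetic above can be reproduced with a minimal Python sketch. The rates and token counts are the ones stated in this report, not verified current pricing. The helper names (`input_cost`, `break_even_requests`) are hypothetical and introduced only for illustration. The model covers input tokens and a one-off training cost only, and leaves out output tokens, labeling, and evaluation effort.

```python
import math

# Rates and token counts as stated in this report (assumptions, not verified pricing).
GPT4O_USD_PER_1K_INPUT = 0.03        # few-shot 4o input rate
FINE_TUNED_USD_PER_1K_INPUT = 0.003  # fine-tuned 3.5-turbo input rate
FEW_SHOT_TOKENS = 1000               # 5-shot examples in the prompt
INPUT_TOKENS = 200                   # the text actually being classified


def input_cost(tokens: int, usd_per_1k: float) -> float:
    """Input-token cost of one request in USD."""
    return tokens / 1000 * usd_per_1k


def break_even_requests(training_cost_usd: float, savings_per_request: float) -> int:
    """Smallest number of requests whose savings cover the training cost."""
    return math.ceil(training_cost_usd / savings_per_request)


# Few-shot prompt pays for examples plus input; the fine-tuned prompt pays for input only.
few_shot_cost = input_cost(FEW_SHOT_TOKENS + INPUT_TOKENS, GPT4O_USD_PER_1K_INPUT)
fine_tuned_cost = input_cost(INPUT_TOKENS, FINE_TUNED_USD_PER_1K_INPUT)

savings_per_request = few_shot_cost - fine_tuned_cost
cost_ratio = few_shot_cost / fine_tuned_cost

if __name__ == "__main__":
    print(f"few-shot 4o:   ${few_shot_cost:.4f} per request")
    print(f"fine-tuned:    ${fine_tuned_cost:.4f} per request")
    print(f"ratio:         {cost_ratio:.0f}x")
    for training_cost in (5, 20):
        n = break_even_requests(training_cost, savings_per_request)
        print(f"training ${training_cost}: recovered after about {n} requests")
    print(f"monthly savings at 50k requests: ${50_000 * savings_per_request:,.0f}")
```

With these inputs the sketch gives roughly $0.0354 saved per request, which is about $1,770 per month at 50k requests. Swapping in current provider rates changes the constants only.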
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:01:40.260409+00:00— report_created — created