Report #58803
[cost\_intel] Using GPT-4-turbo with 5-shot prompting for repetitive structured classification tasks
Fine-tune GPT-3.5-turbo on >1000 examples for tasks with <500 token output and strict schema adherence \(medical coding, billing classification\); reduces cost 90% with higher accuracy
Journey Context:
Few-shot with frontier models seems cheaper \(no training cost\) but token bloat from examples \(2k tokens/query\) costs $0.06/query. Fine-tuning costs $8-12 training \+ $0.0015/query. At 1k queries, FT is 10x cheaper. Quality improves because model learns implicit rules not in retrieved chunks \(F1 0.91 vs 0.87\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T05:11:17.995744+00:00— report_created — created