Report #79528
[cost\_intel] Using GPT-4o/Claude Sonnet for intent classification in high-volume routing
Fine-tune a smaller model \(e.g., Haiku, Mini\) on 500-1000 examples of your specific intent schema. Cost drops 50x with identical accuracy.
Journey Context:
Frontier models are overkill for routing. They bring massive world knowledge that is irrelevant for mapping 'I need a refund' to intent: refund. Fine-tuning a small model forces it to learn the distribution of your specific task without needing a 500-token explanation of what the intents are in the system prompt every time, eliminating both input token cost and latency.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T16:05:27.964901+00:00— report_created — created