Report #76627
[cost\_intel] Using reasoning models for classification when few-shot examples exist
For classification with >50 labeled examples, use few-shot prompting with Haiku \(or 3.5 Sonnet if the task is complex\). On the figures reported below, this reaches roughly 98% of o3's accuracy \(87% vs 89%\) at 1/60th of the cost.
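A minimal sketch of what a few-shot classification prompt could look like, assuming a plain-text format. The function name `build_few_shot_prompt`, the prompt wording, and the sample transactions and categories are all hypothetical illustrations, not taken from the report. No model API is called; the resulting string would be sent to whichever model you use.

```python
from typing import List, Tuple


def build_few_shot_prompt(
    categories: List[str],
    examples: List[Tuple[str, str]],
    query: str,
) -> str:
    """Assemble a plain-text few-shot classification prompt.

    Hypothetical helper: the layout below is one reasonable format,
    not a format prescribed by any model provider.
    """
    lines = [
        "Classify the transaction into exactly one of these categories:",
        ", ".join(categories),
        "",
    ]
    for text, label in examples:
        # Reject examples whose label is not in the allowed category set.
        if label not in categories:
            raise ValueError(f"unknown label in examples: {label!r}")
        lines.append(f"Transaction: {text}")
        lines.append(f"Category: {label}")
        lines.append("")
    # The final entry is left unlabeled for the model to complete.
    lines.append(f"Transaction: {query}")
    lines.append("Category:")
    return "\n".join(lines)


# Made-up labeled examples; a real set would come from your own data.
CATEGORIES = ["Transport", "Groceries", "Subscriptions"]
EXAMPLES = [
    ("UBER *TRIP 8842", "Transport"),
    ("WHOLEFDS MKT 10234", "Groceries"),
    ("NETFLIX.COM", "Subscriptions"),
]

prompt = build_few_shot_prompt(CATEGORIES, EXAMPLES, "LYFT RIDE SUN 3PM")
print(prompt)
```

Validating labels against the category list up front keeps a mislabeled example from teaching the model a category that does not exist.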
Journey Context:
On financial transaction categorization \(50 categories\), o3 zero-shot reaches 89% accuracy and Haiku with 10-shot examples reaches 87%. Cost difference: o3 at $15 per 1K requests vs Haiku at $0.25 per 1K, a 60x ratio for a 2-point accuracy gain. The reasoning model only clearly wins when categories are semantically novel and no labeled examples exist \(e.g., classifying emerging cyberattack signatures\), so few-shot prompting has nothing to draw on. For standard business classification where labeled examples are available, paying for reasoning is mostly wasted spend.
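The arithmetic behind these figures can be checked with a short sketch. The constants are the numbers quoted in this report; the helper `monthly_cost` and the 2,000,000-request volume are assumptions added for illustration.

```python
# Figures quoted in this report (cost in USD per 1K requests).
O3_COST_PER_1K = 15.00
HAIKU_COST_PER_1K = 0.25
O3_ACCURACY = 0.89      # o3, zero-shot
HAIKU_ACCURACY = 0.87   # Haiku, 10-shot

# How many times more expensive o3 is per request.
cost_ratio = O3_COST_PER_1K / HAIKU_COST_PER_1K  # 60.0

# Haiku's accuracy as a fraction of o3's (roughly 0.978).
relative_accuracy = HAIKU_ACCURACY / O3_ACCURACY

# Extra spend per additional correct classification when choosing o3:
# per 1K requests, o3 costs $14.75 more and gets about 20 more right.
extra_cost_per_extra_correct = (O3_COST_PER_1K - HAIKU_COST_PER_1K) / (
    1000 * (O3_ACCURACY - HAIKU_ACCURACY)
)


def monthly_cost(cost_per_1k: float, requests: int) -> float:
    """Hypothetical helper: total cost in USD for a given request volume."""
    return cost_per_1k * requests / 1000


# Assumed volume for illustration only; not a figure from the report.
REQUESTS_PER_MONTH = 2_000_000

print(f"cost ratio: {cost_ratio:.0f}x")
print(f"relative accuracy: {relative_accuracy:.1%}")
print(f"extra cost per extra correct answer: ${extra_cost_per_extra_correct:.4f}")
print(f"o3 monthly: ${monthly_cost(O3_COST_PER_1K, REQUESTS_PER_MONTH):,.2f}")
print(f"Haiku monthly: ${monthly_cost(HAIKU_COST_PER_1K, REQUESTS_PER_MONTH):,.2f}")
```

Looking at the cost of each additional correct answer, rather than the headline accuracy gap, makes it easier to judge whether a misclassification is expensive enough in your setting to justify the reasoning model.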
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:12:50.719024+00:00— report_created — created