Report #71923
[cost_intel] Overpaying for simple entity extraction or multi-label classification with frontier models
Route deterministic extraction and classification tasks to Haiku/Flash. With structured output modes enabled, they come within 2-5% of Sonnet/Pro accuracy at 1/10th to 1/20th the cost.
Journey Context:
Engineers often default to frontier models on the assumption that 'smarter model = better extraction'. However, for structured output where the context is local and unambiguous, smaller models already perform at or near the ceiling for the task. When small models do fail on these tasks, the cause is usually a formatting error (e.g., dropping a JSON bracket), not a reasoning error. Fixing the format via constrained decoding (JSON mode) is far cheaper than upgrading the model: a 10k-item batch classification job costs $0.25 on Haiku vs $5.00 on Sonnet, a 20x difference.
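One way to act on this is to send every item to the small tier first, validate the output shape, and escalate only the items that fail validation. The sketch below is illustrative and unverified against any specific provider: `small` and `large` are hypothetical callables standing in for whatever client code you use, `ALLOWED_LABELS` is a made-up label set, and the per-item costs in `blended_cost` are inputs you supply (for instance, derived from the $0.25 and $5.00 per-10k figures above).

```python
import json
from typing import Callable, List, Optional, Tuple

# Hypothetical label set, for illustration only.
ALLOWED_LABELS = {"billing", "bug", "feature_request"}


def parse_labels(raw: str) -> Optional[List[str]]:
    """Return the label list if `raw` is JSON of the form {"labels": [...]}, else None."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # formatting failure, e.g. a dropped bracket
    if not isinstance(data, dict):
        return None
    labels = data.get("labels")
    if not isinstance(labels, list) or not all(isinstance(x, str) for x in labels):
        return None
    if not set(labels) <= ALLOWED_LABELS:
        return None  # model produced a label outside the allowed set
    return labels


def classify(
    text: str,
    small: Callable[[str], str],
    large: Callable[[str], str],
) -> Tuple[List[str], str]:
    """Try the small tier first; escalate only when its output fails validation.

    `small` and `large` are assumed to take the input text and return the
    model's raw string response. Returns (labels, tier_used).
    """
    labels = parse_labels(small(text))
    if labels is not None:
        return labels, "small"
    labels = parse_labels(large(text))
    if labels is None:
        raise ValueError("both tiers returned output that failed validation")
    return labels, "large"


def blended_cost(
    n_items: int,
    escalation_rate: float,
    small_per_item: float,
    large_per_item: float,
) -> float:
    """Estimated batch cost when every item hits the small tier and a
    fraction `escalation_rate` is retried on the large tier."""
    return n_items * (small_per_item + escalation_rate * large_per_item)
```

If the provider's JSON mode already guarantees well-formed output, the `json.loads` check rarely fires and the escalation path mostly covers out-of-set labels. The escalation rate is something to measure on your own data, not a figure from this report.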
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:18:34.738439+00:00— report_created — created