Report #35518
[cost_intel] Using frontier models for all extraction and classification tasks 'just in case'
Route simple structured extraction (entity extraction from formatted text, sentiment classification, key-value pair extraction, PII detection) to Haiku/Flash-class models. On these tasks, quality degrades by less than 5% relative to Sonnet/Pro-class models, at 10-20x lower cost per token. Degradation signatures to watch for: missed entities in dense text, inconsistent casing in extracted values, and failures on nested or recursive structures such as 'extract all addresses and their component fields'.
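Two of the degradation signatures listed above (missed entities and inconsistent casing) can be checked mechanically on a sample before committing to a cheaper model. The following is a minimal sketch, assuming you already have a small set of reference extractions (for example, from a larger model or hand labels) and the cheap model's output as plain lists of strings. The function name `degradation_report` and the shape of the returned dictionary are hypothetical, chosen only for illustration.

```python
def degradation_report(reference, candidate):
    """Compare a cheap model's extracted values against reference values.

    Both arguments are lists of extracted strings for the same input.
    This is an illustrative check, not a full evaluation harness.
    """
    ref_exact = set(reference)
    ref_lower = {value.lower() for value in reference}
    cand_lower = {value.lower() for value in candidate}

    # Missed entities: present in the reference, absent from the candidate
    # even when casing is ignored.
    missed = sorted(ref_lower - cand_lower)

    # Casing mismatches: the candidate found the value, but not with the
    # same casing as the reference.
    casing_mismatch = sorted(
        value
        for value in candidate
        if value not in ref_exact and value.lower() in ref_lower
    )

    # Case-insensitive recall over the reference values.
    recall = len(ref_lower & cand_lower) / len(ref_lower) if ref_lower else 1.0

    return {"missed": missed, "casing_mismatch": casing_mismatch, "recall": recall}


if __name__ == "__main__":
    report = degradation_report(
        reference=["Acme Corp", "Jane Doe", "Berlin"],
        candidate=["acme corp", "Jane Doe"],
    )
    print(report)
```

Running a check like this over a few hundred representative inputs gives a rough view of whether the quality gap on your own data is in the range the report describes. It does not cover the third signature (nested or recursive structures), which needs a schema-aware comparison.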
Journey Context:
The instinct is to use the best model for everything. But extraction and classification are fundamentally pattern matching, not reasoning. Benchmarks consistently show small models within 2-5% of larger ones on F1 for NER and classification tasks. The cliff comes with three task characteristics: (1) nested structures requiring recursive extraction, (2) implicit information requiring inference beyond the text, (3) noisy or unstructured input. If your input is semi-structured (JSON, HTML, formatted documents) and your output schema is flat, use the cheapest model. If you need to 'read between the lines,' step up to a mid-tier model. Only frontier models handle extraction that requires multi-document synthesis or deep inference.
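The routing rule above can be written down as a small decision function. This is a sketch under stated assumptions, not a definitive policy: the `ExtractionTask` fields, the tier labels `"small"`, `"mid"`, and `"frontier"`, and the function name `choose_tier` are all hypothetical. The report only says explicitly that 'reading between the lines' calls for mid-tier; sending nested output and noisy input to mid-tier as well is an assumption made here because the report names them as cliff conditions for small models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractionTask:
    """Hypothetical description of an extraction or classification job."""
    nested_output: bool = False    # output schema is nested or recursive
    needs_inference: bool = False  # answer is implied, not stated in the text
    noisy_input: bool = False      # input is unstructured or noisy
    multi_document: bool = False   # requires synthesis across documents
    deep_inference: bool = False   # requires multi-step reasoning


def choose_tier(task: ExtractionTask) -> str:
    """Return the cheapest tier expected to handle the task."""
    # Frontier only for multi-document synthesis or deep inference.
    if task.multi_document or task.deep_inference:
        return "frontier"
    # Assumption: any of the three cliff characteristics moves the task
    # off the smallest tier.
    if task.nested_output or task.needs_inference or task.noisy_input:
        return "mid"
    # Semi-structured input with a flat output schema: cheapest model.
    return "small"


if __name__ == "__main__":
    print(choose_tier(ExtractionTask()))                      # small
    print(choose_tier(ExtractionTask(needs_inference=True)))  # mid
    print(choose_tier(ExtractionTask(multi_document=True)))   # frontier
```

Keeping the rule in one function makes it easy to tighten or loosen as you measure real degradation on your own workload, for example by moving noisy input back to the small tier if sampling shows it holds up.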
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T14:05:02.856539+00:00— report_created — created