Report #96934
[cost\_intel] Using frontier models for all entity extraction tasks regardless of complexity
Route flat, single-type entity extraction \(e.g., names, dates, standard PII\) to Haiku/Flash; reserve Sonnet/Pro for nested, discontinuous, or highly ambiguous entities.
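A minimal sketch of that routing rule, assuming each extraction job can be described by three flags. `ExtractionTask`, `choose_tier`, and the tier labels `small` and `frontier` are hypothetical names used for illustration; map the labels to whichever Haiku/Flash or Sonnet/Pro model IDs your provider exposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractionTask:
    """Hypothetical description of one extraction job (not a real API)."""
    nested: bool = False         # entities can contain other entities
    discontinuous: bool = False  # one entity spans non-adjacent tokens
    ambiguous: bool = False      # type or boundary needs heavy context


def choose_tier(task: ExtractionTask) -> str:
    """Return 'frontier' for hard entity structure, otherwise 'small'."""
    if task.nested or task.discontinuous or task.ambiguous:
        return "frontier"
    return "small"


# Flat names/dates/PII extraction stays on the cheaper tier.
flat_tier = choose_tier(ExtractionTask())
# Nested organization names go to the larger tier.
nested_tier = choose_tier(ExtractionTask(nested=True))
print(flat_tier, nested_tier)
```

Defaulting to the small tier and escalating only on an explicit flag keeps the expensive path opt-in, which is the opposite of the habit this report describes.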
Journey Context:
Smaller models come within 1-2% of frontier models on flat NER tasks at 1/20th the cost. Their quality drops sharply \(a 10-15% F1 drop\) on nested entities \(e.g., 'The \[Department of \[Computer Science\]\]'\), because they lack the deep semantic parsing needed to resolve overlapping spans. Teams tend to default to frontier models out of habit, even for simple, regex-adjacent extraction where the smaller tier is sufficient.
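One way to tell which bucket a dataset falls into is to check a labeled sample for overlapping gold spans. The sketch below assumes annotations are available as end-exclusive character offsets; `has_overlapping_spans` is a hypothetical helper, and it only detects nesting or overlap, not discontinuous entities.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


def has_overlapping_spans(spans: List[Span]) -> bool:
    """True if any two spans overlap or one contains the other."""
    max_end = None
    for start, end in sorted(spans):
        # A span starting before the furthest end seen so far overlaps it.
        if max_end is not None and start < max_end:
            return True
        max_end = end if max_end is None else max(max_end, end)
    return False


text = "The Department of Computer Science"
outer = "Department of Computer Science"
inner = "Computer Science"

# Offsets are computed from the text rather than hard-coded.
outer_span = (text.index(outer), text.index(outer) + len(outer))
inner_span = (text.index(inner), text.index(inner) + len(inner))

nested_example = [outer_span, inner_span]  # inner lies inside outer
flat_example = [(0, 3), (4, 14)]           # two disjoint spans

print(has_overlapping_spans(nested_example))
print(has_overlapping_spans(flat_example))
```

If a sample of gold annotations never triggers this check, the task is likely flat enough for the smaller tier; if it triggers often, that points to the nested case described above.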
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T21:17:16.032626+00:00— report_created — created