Report #85661
[cost_intel] Using reasoning models for Named Entity Recognition yields negligible accuracy gain at 30x cost
Use GPT-4o-mini with constrained JSON mode for NER. On CoNLL-2003, o1-preview improves F1 by only 0.3 points over GPT-4o (94.8 vs 94.5) but costs 30x more and adds 10-20 s of latency, which makes it unsuitable for high-throughput extraction pipelines.
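To make the overhead concrete, the arithmetic can be sketched as below. Only the 30x cost ratio and the 10-20 s added latency come from this report; the function name, the per-document baseline price, and the document count are hypothetical placeholders to be replaced with your own figures.

```python
def reasoning_overhead(docs, base_cost_per_doc, cost_ratio=30.0,
                       added_latency_s=(10.0, 20.0)):
    """Estimate the extra spend and extra sequential wall-clock time incurred
    by routing `docs` documents to the reasoning model instead of the baseline.

    base_cost_per_doc is a hypothetical input, not a measured or published price.
    cost_ratio and added_latency_s default to the figures quoted in this report.
    """
    # Extra spend is the reasoning-model bill minus the baseline bill.
    extra_cost = docs * base_cost_per_doc * (cost_ratio - 1.0)
    # Added latency accumulates per document when requests run one after another.
    extra_seconds = (docs * added_latency_s[0], docs * added_latency_s[1])
    return extra_cost, extra_seconds


# Hypothetical workload: 1,000 documents at a baseline of 0.001 per document.
cost, (low_s, high_s) = reasoning_overhead(1000, 0.001)
print(f"extra cost: {cost:.2f}, extra time: {low_s / 3600:.1f}-{high_s / 3600:.1f} h")
```

Running requests concurrently reduces the wall-clock figure but not the cost figure.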
Journey Context:
Teams often default to the strongest model (o1) for every NLP task, assuming that higher capability means better performance. Named Entity Recognition, however, is a shallow pattern-matching task that does not benefit from deep reasoning; the solution path is close to deterministic, of the kind that lookup tables and CRF patterns already capture. The "reasoning tax" is therefore pure overhead. o1 models also handle structured output constraints less reliably on simple extraction, sometimes overthinking and hallucinating entity boundaries. The better approach is to use the cheapest fast model (GPT-4o-mini) with JSON schema constraints or regex post-processing. o1 provides value only when the task involves complex, nested, or implicit entity relations, such as those requiring multi-sentence coreference.
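A minimal sketch of the schema-plus-post-processing approach is shown below, assuming the model is asked to return a JSON object with an `entities` array. The names `NER_SCHEMA` and `ground_entities` are hypothetical, the label set is the CoNLL-2003 one, and the model call itself is deliberately left out: the schema would be passed to whatever structured-output option your provider offers. The post-processing step drops any entity whose label is outside the schema or whose text does not occur verbatim in the source, which guards against hallucinated boundaries.

```python
import json
import re

ENTITY_LABELS = ("PER", "ORG", "LOC", "MISC")  # CoNLL-2003 label set

# JSON Schema describing the expected model output (hypothetical shape).
NER_SCHEMA = {
    "type": "object",
    "properties": {
        "entities": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "text": {"type": "string"},
                    "label": {"type": "string", "enum": list(ENTITY_LABELS)},
                },
                "required": ["text", "label"],
                "additionalProperties": False,
            },
        }
    },
    "required": ["entities"],
    "additionalProperties": False,
}


def ground_entities(source_text, raw_json):
    """Keep entities with a known label whose text appears verbatim in
    source_text, and attach the character offsets of the first match."""
    try:
        payload = json.loads(raw_json)
    except json.JSONDecodeError:
        return []
    if not isinstance(payload, dict) or not isinstance(payload.get("entities"), list):
        return []
    grounded = []
    for ent in payload["entities"]:
        if not isinstance(ent, dict):
            continue
        text, label = ent.get("text"), ent.get("label")
        if not isinstance(text, str) or not text or label not in ENTITY_LABELS:
            continue
        # Match on word boundaries so "Paris" does not match inside "Parisian".
        match = re.search(r"(?<!\w)" + re.escape(text) + r"(?!\w)", source_text)
        if match is None:
            continue  # span not present in the source: treat as hallucinated
        grounded.append({"text": text, "label": label,
                         "start": match.start(), "end": match.end()})
    return grounded
```

For example, given the source sentence "Angela Merkel visited Paris with Siemens executives." and a response that also lists "Berlin" as LOC and "Siemens" with the label COMPANY, only the "Angela Merkel" and "Paris" entries survive. This sketch records only the first occurrence of each string; repeated mentions would need an occurrence index from the model or a scan over all matches.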
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T02:22:02.823407+00:00— report_created — created