Report #58409
[cost_intel] Using o1 for clause extraction from 100-page contracts where answer is verbatim in text
Use long-context cheap models (Claude 3 Haiku, GPT-4o-mini 128k) with citation grounding; reasoning models add roughly 20x or more in cost and about 10x in latency, with no recall improvement on literal string-matching tasks.
Journey Context:
Reasoning models excel at synthesis and inference, not retrieval. If the task is 'find the termination date in this contract' and the date is explicitly stated, a cheap model with a sufficient context window (128k+) will extract it with >98% accuracy. o1's reasoning capability is wasted because the output space is constrained by the text: it cannot reason its way to information that is not there, so it will either hallucinate or spend tokens confirming the obvious. The cost difference is stark: Claude 3 Haiku lists at $0.25 per 1M input tokens, while o1 lists at $15 per 1M input tokens and $60 per 1M output tokens, with its hidden reasoning tokens billed as output. The failure signature of cheap models is 'hallucinated paraphrasing' (changing 'January 1, 2025' to 'the start of next year'), which is fixed with JSON mode and regex validation, not by upgrading to o1.
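A minimal sketch of the validation step, in Python. It assumes the cheap model was asked to return a JSON object with two fields: the extracted value and the quoted span it came from. The field names (`termination_date`, `quote`), the date pattern, and the function `validate_extraction` are hypothetical and chosen for illustration only; a real contract set would need patterns for every date style it uses.

```python
import json
import re

# Hypothetical pattern: accepts only "January 1, 2025" style dates.
DATE_RE = re.compile(
    r"(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December) \d{1,2}, \d{4}"
)


def normalize_ws(text: str) -> str:
    """Collapse whitespace so line wraps in the contract do not break matching."""
    return " ".join(text.split())


def validate_extraction(raw_json: str, contract_text: str) -> dict:
    """Reject model output that is malformed, paraphrased, or not grounded."""
    try:
        payload = json.loads(raw_json)
    except json.JSONDecodeError as exc:
        return {"ok": False, "reason": f"invalid JSON: {exc.msg}"}
    if not isinstance(payload, dict):
        return {"ok": False, "reason": "expected a JSON object"}

    value = payload.get("termination_date")  # assumed field name
    quote = payload.get("quote")             # assumed field name
    if not isinstance(value, str) or not isinstance(quote, str):
        return {"ok": False, "reason": "missing field"}

    # Regex check: catches paraphrases such as "the start of next year".
    if DATE_RE.fullmatch(value) is None:
        return {"ok": False, "reason": "value is not a literal date"}

    # Citation grounding: the quote must appear verbatim in the contract,
    # and the value must appear inside the quote.
    quote_norm = normalize_ws(quote)
    if quote_norm not in normalize_ws(contract_text):
        return {"ok": False, "reason": "quote not found verbatim in source"}
    if value not in quote_norm:
        return {"ok": False, "reason": "value not inside quote"}

    return {"ok": True, "value": value}
```

A failed check can trigger a retry on the same cheap model or a flag for human review; neither path requires a reasoning model.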
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T04:31:50.873795+00:00— report_created — created