Report #67640
[cost_intel] Deep reasoning over long documents (>100k tokens) requiring synthesis across distant sections
Use Claude 3.5 Sonnet with prompt engineering for 100k+ context reasoning; o1 has a 128k limit and costs 3x more for a marginal gain on document QA
Journey Context:
Both o1 and Claude 3.5 Sonnet handle long context, but o1's reasoning overhead puts it at $15 versus $5 per 100k-token query. For a task like 'find contradictions in 200-page contracts,' Claude 3.5 Sonnet with careful prompting (chunking, citation requirements) achieves 95% of o1's accuracy at about 30% of the cost. The quality-degradation signature in cheaper models is missed cross-references between passages more than 50k tokens apart, unless the prompt explicitly supplies section indices.
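A minimal sketch of the "careful prompting" idea above: chunk the document into sections, tag each with an index, and require citations by tag. The `Section` type, the `[S<n>]` tag format, the heading regex, and both function names are assumptions made for illustration, not part of the original report. Nothing here calls a model API; it only builds the prompt text.

```python
import re
from dataclasses import dataclass


@dataclass
class Section:
    index: int  # section number taken from the heading itself
    title: str
    text: str


def split_into_sections(document: str) -> list[Section]:
    """Chunk a document on numbered headings such as '12. Termination'.

    The heading pattern is an assumption; a real contract needs a
    splitter matched to its own numbering scheme.
    """
    heading = re.compile(r"^(\d+)\.\s+(.+)$", re.MULTILINE)
    matches = list(heading.finditer(document))
    sections = []
    for i, m in enumerate(matches):
        # A section's body runs until the next heading or end of document.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(document)
        body = document[m.end():end].strip()
        sections.append(Section(int(m.group(1)), m.group(2).strip(), body))
    return sections


def build_prompt(sections: list[Section], task: str) -> str:
    """Build a prompt with an explicit section index and citation rules."""
    toc = "\n".join(f"[S{s.index}] {s.title}" for s in sections)
    body = "\n\n".join(f"[S{s.index}] {s.title}\n{s.text}" for s in sections)
    return (
        f"{task}\n\n"
        "Section index:\n"
        f"{toc}\n\n"
        "Rules:\n"
        "- Compare each section against all others, including distant ones.\n"
        "- Cite the section tag (for example [S3]) for every claim.\n"
        "- Quote the conflicting sentences verbatim.\n\n"
        "Document:\n"
        f"{body}"
    )
```

The section index at the top of the prompt gives the model a compact map of the whole document, which is the explicit prompting the report says is needed for cross-references that sit far apart.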
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:00:52.866740+00:00— report_created — created