Report #83294
[cost_intel] Single-pass long context (200k tokens) with Claude 3 Opus achieves under 60% recall on middle-document QA vs 85% for recursive summarization with Claude 3 Haiku, at roughly 1/30th the cost
For documents over 100k tokens, prefer recursive summarization (map-reduce with Haiku) over a single pass through a frontier model. Use hierarchical chunking with overlapping chunks to preserve cross-section dependencies.
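A minimal sketch of the overlapping-chunk step, assuming the document is already tokenized into a list. The function name `chunk_with_overlap`, the 4,000-token window, and the 200-token overlap are illustrative assumptions, not values confirmed by this report beyond the 4k chunk size, and a whitespace split is only a rough stand-in for a real model tokenizer.

```python
from typing import List


def chunk_with_overlap(
    tokens: List[str], chunk_size: int = 4000, overlap: int = 200
) -> List[List[str]]:
    """Split tokens into windows of up to chunk_size tokens.

    Each window after the first repeats the last `overlap` tokens of the
    previous window, so text that straddles a boundary appears in both.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each window advances
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the document
    return chunks
```

A larger overlap gives more protection for dependencies that cross chunk boundaries, but it also increases the number of tokens sent to the summarizer, so it trades against cost.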
Journey Context:
The 'Lost in the Middle' phenomenon affects even long-context frontier models. On 200k-token documents, Claude 3 Opus achieves under 60% accuracy on questions that require information from the middle 50% of the text, which this report attributes to attention decay. Recursive summarization uses a cheaper model (Haiku at $0.25/M tokens) to summarize 4k-token chunks and then summarizes the summaries. It achieves 85% accuracy at $0.50 total cost, versus $15 for single-pass Opus (30x cheaper). The quality-degradation signature is a U-shaped recall curve for single-pass (high recall at the start and end, low in the middle), versus flat, high recall for the recursive approach.
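The summarize-then-summarize-the-summaries loop described above could be sketched as follows. This is an illustration under stated assumptions, not the reporter's implementation: `recursive_summarize` and `split_words` are hypothetical names, the `summarize` argument is a placeholder for whatever call sends one chunk to a cheap model such as Haiku, and chunks are measured in whitespace-separated words rather than real model tokens. The cost constants simply restate the figures quoted in this report.

```python
from typing import Callable, List

# Figures quoted in this report (not independently verified).
SINGLE_PASS_COST_USD = 15.00
RECURSIVE_COST_USD = 0.50
COST_RATIO = SINGLE_PASS_COST_USD / RECURSIVE_COST_USD  # 30.0


def split_words(text: str, size: int) -> List[str]:
    """Split text into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def recursive_summarize(
    text: str, summarize: Callable[[str], str], chunk_size: int = 4000
) -> str:
    """Map-reduce summarization.

    Map: summarize every chunk independently.
    Reduce: join the summaries, re-chunk, and repeat until one chunk is left.
    `summarize` is a caller-supplied function (hypothetical model wrapper).
    """
    chunks = split_words(text, chunk_size)
    if not chunks:
        return ""
    while len(chunks) > 1:
        summaries = [summarize(chunk) for chunk in chunks]
        new_chunks = split_words(" ".join(summaries), chunk_size)
        if not new_chunks:
            return ""  # every summary came back empty
        if len(new_chunks) >= len(chunks):
            # Guard against a summarizer that does not shrink its input.
            raise ValueError("summarize() did not reduce the number of chunks")
        chunks = new_chunks
    return summarize(chunks[0])
```

Because every chunk is summarized on its own, no passage sits in the middle of a very long prompt, which is consistent with the flat recall curve reported for the recursive approach. Passing the model call in as a parameter also keeps the loop testable with a stub before any paid API requests are made.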
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:23:40.212406+00:00— report_created — created