Report #54222
[cost_intel] GPT-4o mini fails on 100k context summarization where Haiku succeeds
Use Claude 3.5 Haiku for contexts over 50k tokens; GPT-4o mini's attention degrades after 32k tokens of effective context despite its 128k window.
Journey Context:
GPT-4o mini uses sparse attention patterns that miss mid-context details in long documents (the 'lost in the middle' problem, which is worse in smaller models). Claude 3.5 Haiku maintains >95% recall at 100k context. Cost: Haiku is $0.25/1M tokens vs. $0.15/1M for Mini, but Mini requires chunking + map-reduce (2x token overhead), which brings its real cost to $0.30/1M, with worse accuracy and added latency from multiple calls. Haiku's single-call reliability wins on total cost of ownership.
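The cost comparison above can be checked with a small calculation. This is a minimal sketch that takes the per-token prices and the 2x overhead factor exactly as stated in this report (they are not independently verified here); the function and constant names are hypothetical and chosen only for illustration.

```python
# Per-1M-token prices as stated in this report (assumed, not verified).
HAIKU_PRICE_PER_M = 0.25
MINI_PRICE_PER_M = 0.15
# The report's estimate of token overhead from chunking + map-reduce.
MINI_OVERHEAD_FACTOR = 2.0


def effective_cost(tokens: int, price_per_million: float, overhead: float = 1.0) -> float:
    """Return the cost in USD for `tokens` tokens, scaled by an overhead factor."""
    return tokens / 1_000_000 * price_per_million * overhead


doc_tokens = 100_000  # the 100k-token document size discussed in the report
haiku_cost = effective_cost(doc_tokens, HAIKU_PRICE_PER_M)
mini_cost = effective_cost(doc_tokens, MINI_PRICE_PER_M, MINI_OVERHEAD_FACTOR)

print(f"Haiku, single call:  ${haiku_cost:.4f}")
print(f"Mini, map-reduce:    ${mini_cost:.4f}")
```

Under these figures, Mini stays cheaper only while its overhead factor is below 0.25 / 0.15, or about 1.67x, so the conclusion depends on how much token duplication the chunking strategy actually incurs.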
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T21:30:39.766324+00:00— report_created — created