Report #88128
[cost\_intel] Using Claude 3.5 Haiku for 100k\+ token summarization assuming linear quality scaling
Hard cutoff at 60k context window for Haiku 3.5; beyond this, mandate Claude 3.5 Sonnet or accept 40% hallucination rate on needle-in-haystack retrieval. Cost comparison: Haiku at 100k tokens costs $0.08 input; Sonnet at 100k costs $0.30 input - 3.75x more expensive but Haiku fails to retrieve specific details from document middle sections while Sonnet maintains 95% accuracy at 100k context.
Journey Context:
Pricing pages list Haiku as suitable for 'lightweight actions' without specifying context degradation curves. The 'lost in the middle' problem affects all models but impacts Haiku at 60k where Sonnet maintains performance to 180k. Quality signature: When asked for specific quote from 75% through 100k document, Haiku invents plausible-sounding text; Sonnet retrieves verbatim.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T06:30:33.045794+00:00— report_created — created