Report #69635
[cost_intel] Using frontier models for all summarization, including simple fact extraction from single documents
Use Haiku/Flash for extractive summarization (pulling key facts, bullet-point lists, or action items from a single document). Use Sonnet/Pro only for abstractive summarization (synthesizing insights across documents, or generating executive summaries with recommendations).
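A minimal sketch of how this split could be applied as a routing rule. The names `Tier`, `SummaryTask`, and `choose_tier` are hypothetical and not part of any SDK, and the caller is assumed to know whether the task needs synthesis or judgment. Sending multi-document tasks to the larger tier is a conservative assumption; the report only covers single-document extraction.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    SMALL = "small"        # Haiku / Flash class models
    FRONTIER = "frontier"  # Sonnet / Pro class models


@dataclass(frozen=True)
class SummaryTask:
    instruction: str
    num_documents: int
    needs_judgment: bool  # True when synthesis, insight, or recommendations are required


def choose_tier(task: SummaryTask) -> Tier:
    """Pick a model tier for a summarization task.

    Assumption: any multi-document task goes to the frontier tier,
    because the report only vouches for single-document extraction.
    """
    if task.num_documents < 1:
        raise ValueError("num_documents must be at least 1")
    if task.needs_judgment or task.num_documents > 1:
        return Tier.FRONTIER
    return Tier.SMALL


if __name__ == "__main__":
    extract = SummaryTask("Extract action items from this transcript", 1, False)
    synthesize = SummaryTask("Write an executive summary with next steps", 3, True)
    print(choose_tier(extract).value)
    print(choose_tier(synthesize).value)
```

The returned tier would then be mapped to whatever concrete model identifier the deployment uses; that mapping is deliberately left out because it is provider-specific.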
Journey Context:
Summarization splits into two very different cost-quality curves. Extractive summarization (for example, 'list the 5 key decisions from this meeting transcript' or 'extract action items') shows less than a 5% quality difference between Haiku and Sonnet, because the smaller model only has to identify and surface the relevant passages. Abstractive summarization (for example, 'write an executive summary that synthesizes themes across these 3 documents and recommends next steps') shows 20-40% quality degradation on smaller models. That degradation has a recognizable signature: smaller models fall back to extraction (listing points sequentially rather than synthesizing), miss cross-document connections, and produce generic rather than insightful summaries. The deciding question: if the summary task requires the model to form an original insight or make a judgment, use a frontier model; if it only requires finding and organizing existing information, a smaller model suffices. Cost difference: 10-20x between tiers.
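To put the 10-20x figure in context, the sketch below estimates what fraction of an all-frontier bill remains once extractive traffic moves to the smaller tier. The function name `blended_cost_fraction` and the 70% extractive share are illustrative assumptions, not measurements from this report. It also assumes both kinds of task consume similar token volumes.

```python
def blended_cost_fraction(extractive_share: float, price_ratio: float) -> float:
    """Cost after routing, as a fraction of the all-frontier cost.

    extractive_share: fraction of summarization spend that is extractive (0..1).
    price_ratio: how many times cheaper the small tier is (10-20 per the report).
    """
    if not 0.0 <= extractive_share <= 1.0:
        raise ValueError("extractive_share must be between 0 and 1")
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    # Abstractive work stays at full price; extractive work is divided by the ratio.
    return (1.0 - extractive_share) + extractive_share / price_ratio


if __name__ == "__main__":
    # Hypothetical workload: 70% of summarization spend is extractive.
    for ratio in (10, 20):
        fraction = blended_cost_fraction(0.7, ratio)
        print(f"{ratio}x cheaper small tier -> {fraction:.1%} of original cost")
```

Under these assumptions the remaining cost is dominated by the abstractive share, so the gain from a 20x ratio over a 10x ratio is small compared with the gain from routing at all.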
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T23:22:01.224308+00:00— report_created — created