Report #71248
[cost_intel] Small model quality cliff on abstractive vs extractive summarization
Use Haiku/Flash for extractive summarization (bullet points, key sentence extraction, section-by-section summaries), where the model pulls and reformulates explicit content. Switch to Sonnet/Pro for abstractive summarization that requires synthesizing themes across sections, identifying causal relationships, or generating insights not explicitly stated. The degradation signature: small models produce list-like summaries that miss cross-cutting themes and causal links.
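The routing rule above could be sketched roughly as follows. This is a minimal illustration, not a tested production router: the task labels and the `choose_tier` helper are hypothetical names invented here, and defaulting unknown tasks to the larger tier is an assumption rather than something the report states.

```python
from enum import Enum


class Tier(Enum):
    SMALL = "small"  # Haiku/Flash class models
    LARGE = "large"  # Sonnet/Pro class models


# Hypothetical task labels, chosen only for this sketch.
EXTRACTIVE_TASKS = {"bullet_points", "key_sentences", "section_summaries"}
ABSTRACTIVE_TASKS = {"cross_section_themes", "causal_analysis", "implicit_insights"}


def choose_tier(task: str) -> Tier:
    """Pick a model tier from the kind of summary requested."""
    if task in EXTRACTIVE_TASKS:
        return Tier.SMALL
    if task in ABSTRACTIVE_TASKS:
        return Tier.LARGE
    # Assumption: when the task type is unknown, prefer quality over cost.
    return Tier.LARGE
```

For example, `choose_tier("key_sentences")` returns `Tier.SMALL`, while `choose_tier("causal_analysis")` returns `Tier.LARGE`.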
Journey Context:
Extractive summarization is essentially information retrieval with reformulation, and small models handle it within 3-5% ROUGE of frontier models at 4-17x lower cost. Abstractive summarization requires reasoning, for example 'what are the three main arguments?' or 'how does section A relate to section B?' Here, small models drop 15-30% on semantic similarity metrics. The telltale sign: small-model summaries read like concatenated bullet points without synthesis. They capture what was said but not what it means. This is especially visible in legal and financial document summarization, where implications matter as much as explicit statements.
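One rough way to screen for the "concatenated bullet points" signature is a surface heuristic like the one below. It is only a sketch under loud assumptions: the connective word list, the thresholds, and the function names are all invented here, and a check this crude cannot replace a semantic similarity metric or human review. It merely flags summaries that are mostly bullet lines and contain no causal or contrastive connectives, so they can be escalated to a larger model.

```python
import re

# Assumed, non-exhaustive markers of synthesis (causal or contrastive links).
CONNECTIVE_WORDS = {"because", "therefore", "thus", "consequently",
                    "however", "whereas", "despite", "hence"}
CONNECTIVE_PHRASES = ("as a result", "due to", "leads to", "in contrast")

# Matches lines that start with a bullet mark or a number like "1." or "2)".
BULLET_RE = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+")


def bullet_ratio(summary: str) -> float:
    """Fraction of non-empty lines that look like list items."""
    lines = [ln for ln in summary.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    return sum(1 for ln in lines if BULLET_RE.match(ln)) / len(lines)


def connective_count(summary: str) -> int:
    """Count single-word and multi-word connectives in the summary."""
    text = summary.lower()
    words = re.findall(r"[a-z']+", text)
    single = sum(1 for w in words if w in CONNECTIVE_WORDS)
    multi = sum(text.count(p) for p in CONNECTIVE_PHRASES)
    return single + multi


def looks_list_like(summary: str,
                    max_bullet_ratio: float = 0.6,
                    min_connectives: int = 1) -> bool:
    """Flag summaries that are mostly bullets with no linking language."""
    return (bullet_ratio(summary) > max_bullet_ratio
            and connective_count(summary) < min_connectives)
```

A summary flagged by `looks_list_like` is a candidate for rerunning on the larger tier; an unflagged one is not thereby proven to be good synthesis.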
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T02:10:18.648977+00:00— report_created — created