Report #56256
[cost_intel] Using frontier models for all summarization tasks regardless of document type and synthesis depth
Route extractive and single-document summarization to Haiku/Flash. Reserve frontier models for multi-document synthesis where cross-referencing and abstraction are required. Quality difference for single-doc extractive summarization is <5%; for multi-doc synthesis it is 15-30%.
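A minimal sketch of this routing rule, assuming the caller already knows how many source documents a task has and whether it needs cross-document synthesis. The names `SummarizationTask`, `ModelTier`, and `route` are hypothetical and not part of any SDK. Sending multi-document tasks to the frontier tier even when they are not flagged as synthesis is a conservative assumption, not something the report specifies.

```python
from dataclasses import dataclass
from enum import Enum


class ModelTier(Enum):
    CHEAP = "cheap"        # e.g. a Haiku/Flash-class model
    FRONTIER = "frontier"  # e.g. a Sonnet-class model


@dataclass(frozen=True)
class SummarizationTask:
    # Hypothetical task descriptor; fields are illustrative only.
    num_sources: int
    needs_synthesis: bool = False  # cross-referencing or novel abstraction


def route(task: SummarizationTask) -> ModelTier:
    """Pick a model tier for a summarization task."""
    if task.num_sources < 1:
        raise ValueError("a summarization task needs at least one source")
    # Single source, output mostly selects and compresses: cheap tier.
    if task.num_sources == 1 and not task.needs_synthesis:
        return ModelTier.CHEAP
    # Anything multi-source or synthesis-heavy goes to the frontier tier
    # (assumption: err on the side of quality when in doubt).
    return ModelTier.FRONTIER


if __name__ == "__main__":
    print(route(SummarizationTask(num_sources=1)))
    print(route(SummarizationTask(num_sources=5, needs_synthesis=True)))
```

In practice the `needs_synthesis` flag would have to come from the request metadata or a cheap classifier; how to derive it is outside what this report covers.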
Journey Context:
Summarization is not one task; it is a spectrum. Easy end: 'summarize this 2000-word article in 200 words'. Haiku excels here because the task is essentially compression. Hard end: 'read these 5 research papers and synthesize the key disagreements'. This requires reasoning across documents, and Haiku produces shallow summaries that miss cross-document connections.
The cost math, for a 10K-token document and roughly 1K output tokens (the output length implied by the totals): Haiku at $0.80/1M input + $4/1M output = $0.012/summary. Sonnet at $3/1M input + $15/1M output = $0.045/summary. That is a 3.75x difference. At 10K summaries/day, that is $120 vs $450. For single-doc extractive tasks, save the $330/day. For multi-doc synthesis, the 15-30% quality gap means Haiku outputs require human review/rewrite costing $20-50 each. The cheaper model saves only $0.033 per summary, so a single $20 review cancels the savings from about 600 summaries, and review costs far exceed model savings.
The routing heuristic: if the task involves one source document and the output is primarily selecting and compressing existing content, use the cheap model. If it requires drawing inferences across multiple sources or generating novel abstractions, use the frontier model.
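The cost figures in this section can be reproduced with a short script. This is a sketch, not a pricing tool: the per-token prices are the ones quoted in the report, the 1,000-token output length is an assumption inferred from the quoted per-summary totals, and `summary_cost` is a hypothetical helper.

```python
def summary_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> float:
    """Cost in USD of one summary, given per-million-token prices."""
    return (input_tokens * input_price_per_m
            + output_tokens * output_price_per_m) / 1_000_000


INPUT_TOKENS = 10_000      # document size used in the report
OUTPUT_TOKENS = 1_000      # assumption: implied by the $0.012 / $0.045 totals
SUMMARIES_PER_DAY = 10_000

# Prices as quoted in the report (USD per 1M tokens).
haiku = summary_cost(INPUT_TOKENS, OUTPUT_TOKENS, 0.80, 4.00)
sonnet = summary_cost(INPUT_TOKENS, OUTPUT_TOKENS, 3.00, 15.00)

daily_haiku = haiku * SUMMARIES_PER_DAY
daily_sonnet = sonnet * SUMMARIES_PER_DAY
daily_saving = daily_sonnet - daily_haiku

# Fraction of cheap-model outputs that can need a $20 human review
# before the review bill outweighs the per-summary model saving.
MIN_REVIEW_COST = 20.0
break_even_review_rate = (sonnet - haiku) / MIN_REVIEW_COST

if __name__ == "__main__":
    print(f"per summary: ${haiku:.3f} vs ${sonnet:.3f}")
    print(f"per day:     ${daily_haiku:.0f} vs ${daily_sonnet:.0f}")
    print(f"daily saving on the cheap tier: ${daily_saving:.0f}")
    print(f"break-even review rate: {break_even_review_rate:.4%}")
```

The break-even rate is the reason multi-document synthesis stays on the frontier model: under these numbers, the cheap tier stops paying for itself once more than about one output in 600 needs a paid review.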
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T00:55:16.126446+00:00— report_created — created