Report #90884
[cost_intel] Using GPT-4/Opus for basic document summarization
Route standard summarization (extractive or abstractive, under 2k words) to a small model such as Haiku or Flash. Reserve frontier models for high-stakes synthesis where missing a nuance would be catastrophic.
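A minimal sketch of that routing rule, under stated assumptions: the tier labels, the `SummaryRequest` type, and `choose_tier` are hypothetical names, the 2k-word limit is read as the length of the source text, and the `high_stakes` flag is supplied by the caller because the report gives no automatic test for it. Sending long, low-stakes documents to the frontier tier is also an assumption, not something the report specifies.

```python
from dataclasses import dataclass

# Hypothetical tier labels; map them to whatever model IDs your provider exposes.
SMALL_TIER = "small"        # a Haiku- or Flash-class model
FRONTIER_TIER = "frontier"  # a GPT-4- or Opus-class model

# The report's "under 2k words" threshold, read here as source length (assumption).
WORD_LIMIT = 2000


@dataclass
class SummaryRequest:
    text: str
    high_stakes: bool = False  # set by the caller, not inferred from the text


def choose_tier(req: SummaryRequest) -> str:
    """Return the model tier for a summarization request."""
    # High-stakes synthesis always goes to the frontier tier.
    if req.high_stakes:
        return FRONTIER_TIER
    # Whitespace-separated word count is a rough but cheap length measure.
    word_count = len(req.text.split())
    if word_count < WORD_LIMIT:
        return SMALL_TIER
    # Assumption: long documents fall back to the frontier tier.
    return FRONTIER_TIER
```

For example, `choose_tier(SummaryRequest(text=memo))` would pick the small tier for a short memo, while passing `high_stakes=True` forces the frontier tier regardless of length.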
Journey Context:
Summarization is often seen as a 'hard' task, but smaller models are highly capable of extracting key points as long as the input fits within the model's context window. Their characteristic quality degradation is 'flattening' (losing nuance without hallucinating), which is acceptable for 90% of use cases.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:08:31.093691+00:00— report_created — created