Report #81919
[cost_intel] Using reasoning models for simple translation or summarization of single documents
Use GPT-4o-mini or Claude 3.5 Haiku for summarization under 4000 words; reserve reasoning models for cross-document synthesis or for arbitrating between contradictory sources
Journey Context:
On CNN/DailyMail summarization, o1 shows a <2% ROUGE improvement over GPT-4o-mini at 50x higher cost. Reasoning models pay off when summarizing >10 documents with conflicting facts, or when the summary requires causal inference (e.g., 'Why did X happen given sources A, B, C?'). Quality degradation signature: on single-document extractive summarization, reasoning models overthink and hallucinate details not present in the text at higher rates than instruct models. Break-even: more than 5 source documents, or a requirement for explicit logic chains, justifies the reasoning premium.
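The break-even rule above could be sketched as a small routing function. This is a minimal illustration, not a tested policy: the 4000-word and 5-document thresholds come from this report, while the `SummaryTask` type, the `choose_tier` function, and the tier labels are hypothetical names introduced here. The report gives no guidance for long inputs that lack any reasoning trigger, so the `large_instruct` fallback is an assumption.

```python
from dataclasses import dataclass

# Thresholds taken from the report text.
MAX_WORDS_FOR_SMALL_MODEL = 4000   # "summarization under 4000 words"
BREAK_EVEN_SOURCE_DOCS = 5         # "> 5 source documents"


@dataclass
class SummaryTask:
    """Hypothetical description of a summarization request."""
    documents: list[str]
    needs_logic_chain: bool = False        # e.g. "Why did X happen given A, B, C?"
    has_conflicting_sources: bool = False  # sources disagree on facts


def word_count(text: str) -> int:
    """Count whitespace-separated words."""
    return len(text.split())


def choose_tier(task: SummaryTask) -> str:
    """Return a hypothetical tier label for the task.

    'reasoning'      -> reasoning model (premium justified per the report)
    'small_instruct' -> small instruct model such as those named in the report
    'large_instruct' -> ASSUMPTION: fallback for long inputs; not from the report
    """
    # Many sources: the report's break-even point for the reasoning premium.
    if len(task.documents) > BREAK_EVEN_SOURCE_DOCS:
        return "reasoning"
    # Explicit logic chains or contradictory sources also justify reasoning.
    if task.needs_logic_chain or task.has_conflicting_sources:
        return "reasoning"
    # Otherwise route by total input length.
    total_words = sum(word_count(doc) for doc in task.documents)
    if total_words < MAX_WORDS_FOR_SMALL_MODEL:
        return "small_instruct"
    return "large_instruct"  # assumption, see lead-in


if __name__ == "__main__":
    single_doc = SummaryTask(documents=["A short news article to summarize."])
    print(choose_tier(single_doc))
```

Checking the cheap conditions first keeps the reasoning tier opt-in, so a single short document never reaches it by default.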
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T20:06:02.565576+00:00— report_created — created