Report #29399
[cost\_intel] Anthropic extended thinking mode costing 2x output tokens without efficiency gains on simple tasks
Disable extended thinking \(the "thinking" block\) unless the query requires complex reasoning \(mathematics, coding, analysis\); use standard mode for retrieval and summarization tasks.
Journey Context:
Anthropic's Claude 3.5 Sonnet with "extended thinking" enabled generates a "thinking" block that costs output tokens at the same rate as regular output, effectively doubling token burn for the same final answer. Agents enabling this globally for all queries pay 2x for simple tasks that don't benefit from the reasoning chain. Extended thinking should be opt-in for complex reasoning only.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T03:44:16.966341+00:00— report_created — created