Report #26227
[cost_intel] Allowing verbose chain-of-thought outputs when only a short structured answer is needed
When only a short structured answer is needed, cap the output length and use a structured output mode (JSON). Output tokens cost roughly 3-5x more than input tokens at most providers.
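To make the price gap concrete, here is a rough cost sketch. The prices and token counts are hypothetical placeholders chosen so that output is 4x the input price (inside the 3-5x range above); they are not any provider's actual rates.

```python
# Hypothetical per-million-token prices; substitute your provider's real rates.
INPUT_PRICE_PER_MTOK = 1.00   # USD, assumed
OUTPUT_PRICE_PER_MTOK = 4.00  # USD, assumed (4x the input price)


def cost_per_call(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one call under the assumed prices."""
    total = (input_tokens * INPUT_PRICE_PER_MTOK
             + output_tokens * OUTPUT_PRICE_PER_MTOK)
    return total / 1_000_000


# Same 500-token prompt; only the length of the reply differs (assumed counts).
verbose = cost_per_call(input_tokens=500, output_tokens=400)  # reasoning + answer
concise = cost_per_call(input_tokens=500, output_tokens=20)   # JSON answer only

print(f"verbose: ${verbose:.6f}  concise: ${concise:.6f}")
print(f"saving per call: {1 - concise / verbose:.0%}")
```

Under these assumed numbers the prompt is identical in both calls, so the whole difference comes from the output side of the bill.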
Journey Context:
A common mistake is letting the model 'think out loud' when the task does not call for it. Chain-of-thought (CoT) is vital for reasoning tasks, but for simple extraction or classification the verbose output burns output tokens at a premium rate. Forcing JSON mode, or adding an instruction such as 'Be concise, output ONLY the JSON', drastically reduces the cost per task because the saving falls on the most expensive tokens.
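A minimal sketch of the prompt-side half of this advice, assuming a sentiment-classification task. The label set, the `build_prompt` and `parse_label` helpers, and the output shape are all hypothetical; no provider API is called here, and enabling a provider's own JSON mode or output-token cap is a separate step that depends on its SDK.

```python
import json

ALLOWED_LABELS = {"positive", "negative", "neutral"}  # hypothetical label set


def build_prompt(text: str) -> str:
    """Build a classification prompt that asks for a JSON object and nothing else."""
    return (
        "Classify the sentiment of the text below.\n"
        "Be concise, output ONLY the JSON, in the form "
        '{"label": "positive" | "negative" | "neutral"}.\n'
        f"Text: {text}"
    )


def parse_label(raw_output: str) -> str:
    """Parse the reply strictly; raise ValueError for anything but the expected object."""
    # json.JSONDecodeError subclasses ValueError, so prose around the JSON fails here.
    data = json.loads(raw_output)
    if not isinstance(data, dict) or set(data) != {"label"}:
        raise ValueError(f"unexpected shape: {data!r}")
    label = data["label"]
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    return label
```

Rejecting replies that wrap the JSON in extra prose is a design choice: it surfaces prompts that are still producing verbose output instead of silently paying for it.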
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T22:25:22.926503+00:00— report_created — created