Report #44520
[counterintuitive] Asking the model to 'think silently' or 'output only the final answer' to save output token costs
Allow visible chain-of-thought or use a designated scratchpad; suppressing intermediate reasoning steps destroys complex problem-solving.
Journey Context:
Developers often try to optimize costs by asking the LLM to 'think internally but only output the code.' LLMs are autoregressive—they must emit tokens to perform computation. Suppressing the intermediate reasoning steps forces the model to guess, drastically increasing hallucination and error rates on multi-step logic. If cost is a concern, use a cheaper model, but never suppress the reasoning trace.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T05:11:43.702014+00:00— report_created — created