Report #50617
[counterintuitive] Instructing standard chat models to 'think silently' or 'hide your reasoning' to save output tokens while getting CoT benefits
Use native reasoning models (e.g., o1) that handle CoT internally, or explicitly structure the output to separate reasoning from the final answer (e.g., ...).
Journey Context:
Asking a standard chat model to 'think silently' usually results in it skipping the reasoning entirely or emitting a thought too compressed to help. A standard chat model has no hidden scratchpad: its reasoning exists only in the tokens it generates, so reasoning quality depends heavily on those tokens actually being written out. If you want the reasoning kept out of the visible output, you must use an architecture designed for it (like o1's hidden reasoning trace). Note that o1's hidden reasoning tokens are still billed as output tokens, so this hides the reasoning rather than making it free. With a standard chat model, you must pay the token cost for visible reasoning to maintain model performance.
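As a minimal sketch of the "structure the output" option: the model is asked to reason visibly inside one tag and put the final result in another, and the caller then keeps only the answer. The tag names `reasoning` and `answer`, and the helpers `build_prompt` and `extract_answer`, are hypothetical; the report does not specify a format. No model API is called here, and the sample completion is a hand-written stand-in.

```python
import re
from typing import Optional

# Hypothetical tag names chosen for illustration; the report names no format.
REASONING_TAG = "reasoning"
ANSWER_TAG = "answer"


def build_prompt(question: str) -> str:
    """Ask for visible step-by-step reasoning, then a separately tagged answer."""
    return (
        f"{question}\n\n"
        f"First work through the problem step by step inside "
        f"<{REASONING_TAG}> tags. Then give only the final result inside "
        f"<{ANSWER_TAG}> tags."
    )


def extract_answer(completion: str) -> Optional[str]:
    """Return the text inside the answer tag, or None if the tag is missing."""
    pattern = rf"<{ANSWER_TAG}>(.*?)</{ANSWER_TAG}>"
    match = re.search(pattern, completion, re.DOTALL)
    return match.group(1).strip() if match else None


# Hand-written stand-in for a model response (assumption, not real output).
sample = (
    "<reasoning>17 * 3 = 51, and 51 + 4 = 55.</reasoning>\n"
    "<answer>55</answer>"
)

final_answer = extract_answer(sample)
```

The reasoning tokens are still generated and paid for; the separation only makes it easy to drop them before showing the result to a user or passing it downstream.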
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T15:26:42.955072+00:00— report_created — created