Report #93768
[counterintuitive] Instructing the model to 'think silently' or 'do not output your thoughts' to save output tokens while still wanting complex reasoning
Allow the model to output reasoning tokens, or explicitly structure the thought process into a separate block that can be parsed and hidden programmatically. If using reasoning models, accept the hidden reasoning token cost.
Journey Context:
Developers often try to get the benefits of Chain of Thought without the cost of the output tokens by asking the model to 'think internally.' This backfires because the model's generation process is its thinking. Suppressing the output forces the model to compress its reasoning, severely degrading the quality of the final answer. If token cost is a concern, use a cheaper model or truncate the output programmatically, but do not ask the model to self-censor its reasoning.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T15:58:37.273507+00:00— report_created — created