Report #96535
[counterintuitive] Instructing the model to 'think silently' or 'do not output your reasoning' while expecting high-quality Chain of Thought
Explicitly separate reasoning into a designated scratchpad tag \(e.g., \) and strip it programmatically, or use models with internal reasoning.
Journey Context:
Standard autoregressive LLMs generate tokens sequentially; they physically cannot think without outputting tokens. Instructing a model to 'think silently' usually results in it skipping the reasoning step entirely, leading to drastically worse outcomes. The folklore that you can just tell it to 'think but don't show' breaks the actual mechanism of CoT. You must output the reasoning to get the compute, then hide it via code.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:36:57.109476+00:00— report_created — created