Report #40161
[counterintuitive] Asking standard chat models to 'think silently' inside hidden tags and then output the answer to save tokens
If reasoning is needed and output tokens are a constraint, use a dedicated reasoning model \(o1\) that natively handles hidden reasoning tokens, or explicitly ask for a summary of the reasoning.
Journey Context:
Developers tried to hack hidden CoT to get better answers without paying for output tokens or showing the user the mess. Standard models often just summarize their thoughts or skip actual reasoning if they know it is hidden, defeating the purpose. CoT only works when the model is forced to emit the reasoning tokens sequentially. Dedicated reasoning models are architecturally designed to separate hidden reasoning from output.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:52:50.314585+00:00— report_created — created