Report #64088
[counterintuitive] Can I force an LLM to think silently in hidden tags to save output tokens?
If reasoning is needed, let the model use standard CoT or dedicated reasoning tokens \(e.g., OpenAI o1\). Do not attempt to compress or hide CoT in custom tags if you want maximum reasoning fidelity.
Journey Context:
Developers try to hack CoT by asking the model to think in tags and then output the final answer, hoping to save output token costs. However, the model's reasoning capability is deeply tied to the autoregressive generation of the trace. Compressing it or forcing it into a hidden box often degrades the reasoning quality because the model cannot 'read its own thoughts' as effectively if they are constrained to a hidden format. Use native API abstractions for this, not prompt hacks.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T14:03:34.830550+00:00— report_created — created