Report #75797
[counterintuitive] Using scratchpad tags to let the model 'think silently' and assuming it improves accuracy
If reasoning is needed, make it visible or use a tool-use loop. If you need structured output, just ask for the output directly with JSON schema enforcement.
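A minimal sketch of the "ask for the output directly" route, assuming the provider enforces a JSON schema on its side and the application still checks the reply before using it. The field names (`label`, `confidence`) and the `parse_reply` helper are hypothetical, chosen only for illustration.

```python
import json

# Hypothetical schema: required field name -> expected Python type.
REQUIRED_FIELDS = {"label": str, "confidence": float}


def parse_reply(raw: str) -> dict:
    """Parse a model reply and check it against REQUIRED_FIELDS.

    Raises ValueError if the reply is not a JSON object, or if a field
    is missing or has the wrong type.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        # JSON does not distinguish 1 from 1.0, so accept ints for floats,
        # but reject booleans (bool is a subclass of int in Python).
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected):
            raise ValueError(f"field {name!r} should be {expected.__name__}")
        data[name] = value
    return data
```

A reply that still carries leftover scratchpad text ahead of the JSON fails at the `json.loads` step, so the problem surfaces immediately instead of leaking downstream.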
Journey Context:
Manually prompting a chat model to 'think in scratchpad tags' often produces filler text that mimics reasoning without changing the outcome. Putting text inside a hidden tag does not make the model reason better; it still just predicts text that looks like thinking. Specialized reasoning models (like o1) rely on RL training for this, not prompt tags.
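One way to test the accuracy assumption rather than take it on faith is to run the same labelled examples through both prompt variants and compare. The sketch below assumes a caller-supplied `ask(prompt)` function that returns the model's final answer as a string; `ask`, the two prompt templates, and the helper names are all hypothetical.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical prompt templates; {q} is replaced by the question text.
DIRECT = "Answer with a single word.\nQuestion: {q}"
SCRATCHPAD = (
    "Think inside <scratchpad> tags, then answer with a single word.\n"
    "Question: {q}"
)


def accuracy(
    ask: Callable[[str], str],
    template: str,
    examples: Iterable[Tuple[str, str]],
) -> float:
    """Fraction of examples where ask() returns the expected answer."""
    total = 0
    correct = 0
    for question, expected in examples:
        answer = ask(template.format(q=question))
        total += 1
        # Compare case-insensitively, ignoring surrounding whitespace.
        if answer.strip().lower() == expected.strip().lower():
            correct += 1
    return correct / total if total else 0.0


def compare(ask, examples):
    """Return (direct_accuracy, scratchpad_accuracy) on the same examples."""
    examples = list(examples)  # reuse identical data for both variants
    return accuracy(ask, DIRECT, examples), accuracy(ask, SCRATCHPAD, examples)
```

If the two numbers are indistinguishable on a representative set, the tags are likely adding tokens and latency without adding accuracy.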
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:49:34.604031+00:00— report_created — created