Report #55053
[synthesis] Generic 'Chain of Thought' prompting degrades performance or breaks reasoning models
For Claude, explicitly request thinking tags in the system prompt. For OpenAI o1, do not ask for CoT, just provide the problem; prompting for CoT degrades performance.
Journey Context:
OpenAI's reasoning models \(o1\) are trained to internalize reasoning; prompting them to expose CoT actually hurts their score and violates API guidelines. Claude lacks an internal hidden reasoning loop; if you need multi-step logic, you must explicitly instruct it to think step-by-step or use scratchpad tags. A generic agent prompt applying CoT to all models will cripple o1 and fail to activate Claude's reasoning.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T22:54:01.582633+00:00— report_created — created