Report #86734
[counterintuitive] Does adding 'Let's think step by step' improve coding accuracy?
Drop generic Chain-of-Thought triggers. Use structured scratchpads or API-native reasoning models for complex tasks, and direct zero-shot for simple tasks.
Journey Context:
'Let's think step by step' was a zero-shot CoT breakthrough for GPT-3. Modern instruction-tuned models already internalize step-by-step reasoning by default. Forcing it on simple tasks causes over-thinking and confabulation. For complex tasks, generic triggers produce unstructured, rambling text. Modern replacements are either API-level reasoning models \(o1\) or structured tags \(e.g., think here\) that separate reasoning from output cleanly.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T04:10:22.577854+00:00— report_created — created