Report #45562
[counterintuitive] Why does chain-of-thought prompting not fix tasks the model fundamentally cannot do?
Use chain-of-thought to decompose tasks the model can already do into manageable steps. For tasks requiring capabilities the architecture does not support (character-level operations, precise counting, state tracking), add external tools instead of longer prompts.
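A minimal sketch of the "external tools" half of this advice, assuming a harness that lets the model request a named tool and receive the result. The names `count_char`, `max_nesting_depth`, `TOOLS`, and `call_tool` are hypothetical and chosen for illustration; they are not part of any specific framework.

```python
from typing import Callable, Dict


def count_char(text: str, char: str) -> int:
    """Exact character count, computed on the raw string rather than on tokens."""
    if len(char) != 1:
        raise ValueError("char must be a single character")
    return text.count(char)


def max_nesting_depth(text: str) -> int:
    """Track bracket state with an explicit stack and return the deepest level.

    Raises ValueError if the brackets are unbalanced or mismatched.
    """
    closers = {")": "(", "]": "[", "}": "{"}
    openers = set(closers.values())
    stack = []
    deepest = 0
    for ch in text:
        if ch in openers:
            stack.append(ch)
            deepest = max(deepest, len(stack))
        elif ch in closers:
            if not stack or stack.pop() != closers[ch]:
                raise ValueError("unbalanced brackets")
    if stack:
        raise ValueError("unbalanced brackets")
    return deepest


# Hypothetical registry: the harness looks up the tool the model asked for.
TOOLS: Dict[str, Callable[..., int]] = {
    "count_char": count_char,
    "max_nesting_depth": max_nesting_depth,
}


def call_tool(name: str, **kwargs) -> int:
    """Dispatch a tool request by name; unknown names raise KeyError."""
    return TOOLS[name](**kwargs)
```

For example, `call_tool("count_char", text="strawberry", char="r")` computes the count deterministically, so the model only has to decide that a tool is needed and report the result.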
Journey Context:
The common belief is that chain-of-thought is a universal capability unlocker: if the model cannot do X, just add CoT. This is wrong in an important way. CoT helps when: (1) the model has the capability but needs decomposition to apply it, (2) the task benefits from intermediate computation steps the model can verify. CoT does NOT help when: (1) the task requires information not in the input representation (tokenization blindness), (2) the task requires computational procedures the architecture cannot express (parity, deep nesting), (3) the model lacks the underlying knowledge. Adding CoT to a character-counting task just produces a longer wrong answer with confident intermediate steps. The model generates plausible-sounding decomposition steps that do not correspond to actual computation. The correct mental model: CoT gives the model more serial steps, but each generated token still comes from a single forward pass with the same architectural constraints. More steps do not create new capabilities.
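To make "tokenization blindness" concrete, here is a toy sketch. It assumes a made-up two-entry vocabulary (`TOY_VOCAB`) and a greedy longest-match encoder (`toy_encode`); both are hypothetical, and real tokenizers use learned subword vocabularies that split words differently. The point is only that the encoded input carries token IDs, not letters.

```python
# Hypothetical toy vocabulary, for illustration only.
TOY_VOCAB = {"straw": 101, "berry": 202}


def toy_encode(word: str) -> list:
    """Greedy longest-match encoding of `word` against TOY_VOCAB."""
    ids = []
    i = 0
    while i < len(word):
        # Try the longest remaining piece first, then shrink it.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in TOY_VOCAB:
                ids.append(TOY_VOCAB[piece])
                i = j
                break
        else:
            raise ValueError(f"no token covers {word[i:]!r}")
    return ids


token_ids = toy_encode("strawberry")    # two opaque integers
letter_count = "strawberry".count("r")  # computed on the raw string instead
```

In this toy setup the input becomes two integers, and nothing in those integers states how many times "r" occurs. Extra reasoning steps operate on the same representation, which is why the count has to come from something that sees the raw string.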
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T06:56:56.165528+00:00— report_created — created