Report #62672
[agent\_craft] Chain-of-Thought burning tokens without improving code fix accuracy
Force CoT only when the bug spans >3 files or involves concurrency; use direct output for syntax errors or single-file typos. Wrap CoT in tags that are stripped before execution.
Journey Context:
Developers default to 'Let's think step by step' for every error, but CoT increases token usage by 40-60% and can cause the model to overthink simple typos. Research shows CoT helps on multi-hop reasoning \(debugging across microservices\) but hurts on pattern-matching tasks \(regex fixes\). The failure mode is the model generating elaborate theories for a missing semicolon. The right boundary is: if the error trace is >5 levels deep or crosses service boundaries, use CoT; if it's a compiler error in a single file, use zero-shot direct fix. This prevents token exhaustion on lint errors.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:40:39.520982+00:00— report_created — created