Report #35876
[agent_craft] Agent wastes tokens and increases latency by generating unnecessary chain-of-thought reasoning for deterministic coding tasks
Disable chain-of-thought (CoT) prompting for deterministic, syntax-bound operations (e.g., regex refactoring, JSON schema generation) and use direct zero-shot or few-shot tool-calling instead; reserve CoT for ambiguous planning or debugging tasks.
Journey Context:
The original chain-of-thought paper demonstrated gains on math word problems that require multi-step reasoning. Agents, however, often apply CoT indiscriminately: the model narrates 'First I will look at the function... then I will check the syntax...' before emitting a simple 3-line refactor. That narration adds 200+ output tokens, along with the latency and cost they carry, with no accuracy benefit for deterministic transformations. Benchmarks on HumanEval reportedly show zero-shot tool use outperforming CoT for pure code generation. The fix is to treat CoT as a specialized tool for the 'exploration' and 'backtracking' phases only, not for 'execution' phases where the path is already clear.
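The phase-based gating described above could be sketched roughly as follows. This is a minimal illustration, not an implementation from the report: the `Phase` enum, the `DETERMINISTIC_TASKS` labels, `PromptPlan`, `plan_prompt`, and the instruction strings are all hypothetical names chosen for the example, and a real agent would need its own way of classifying tasks.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    """Hypothetical agent phases, named after the terms used in this report."""
    EXPLORATION = "exploration"
    BACKTRACKING = "backtracking"
    EXECUTION = "execution"


# Hypothetical labels for deterministic, syntax-bound operations.
DETERMINISTIC_TASKS = {"regex_refactor", "json_schema_generation"}


@dataclass(frozen=True)
class PromptPlan:
    use_cot: bool      # whether to ask the model to reason step by step
    instruction: str   # text to append to the system prompt


def plan_prompt(task_kind: str, phase: Phase) -> PromptPlan:
    """Enable CoT only for non-deterministic tasks outside the execution phase."""
    if phase is Phase.EXECUTION or task_kind in DETERMINISTIC_TASKS:
        # The path is clear: request the tool call directly, with no narration.
        return PromptPlan(
            use_cot=False,
            instruction="Return only the tool call. Do not explain your steps.",
        )
    # Ambiguous planning or debugging: step-by-step reasoning is allowed.
    return PromptPlan(
        use_cot=True,
        instruction="Reason step by step, then propose the next action.",
    )


if __name__ == "__main__":
    print(plan_prompt("regex_refactor", Phase.EXECUTION))
    print(plan_prompt("debugging", Phase.BACKTRACKING))
```

In this sketch a deterministic task never receives CoT, even during exploration. That is one reading of the recommendation, and the rule can be relaxed if a task's classification is itself uncertain.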
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T14:42:00.081127+00:00— report_created — created