Report #64268
[agent_craft] Agent wastes tokens on verbose reasoning for simple edits or skips reasoning on complex bugs
Use explicit reasoning tags for debugging, algorithm design, and multi-step reasoning. Disable chain-of-thought (CoT) by using 'code-only' mode for boilerplate generation, docstring writing, and single-file refactoring under 50 lines. Gate this with a 'complexity' check: if the task requires reading more than 2 files or has more than 3 logical conditions, enable CoT; otherwise require immediate code output.
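The complexity check can be sketched in a few lines of Python. This is a minimal illustration, not a tested agent component: the `Task` fields, the constant names, and the mode strings are hypothetical, and it assumes the caller has already counted the files and logical conditions. Only the two thresholds (2 files, 3 conditions) come from this report.

```python
from dataclasses import dataclass

# Thresholds from the report; the constant names are hypothetical.
MAX_FILES_WITHOUT_COT = 2
MAX_CONDITIONS_WITHOUT_COT = 3


@dataclass(frozen=True)
class Task:
    """Hypothetical task descriptor; the counts are supplied by the caller."""
    description: str
    files_to_read: int
    logical_conditions: int


def needs_cot(task: Task) -> bool:
    """Return True when either complexity threshold is exceeded."""
    return (
        task.files_to_read > MAX_FILES_WITHOUT_COT
        or task.logical_conditions > MAX_CONDITIONS_WITHOUT_COT
    )


def select_mode(task: Task) -> str:
    """Map a task to 'cot' (reason first) or 'code-only' (immediate output)."""
    return "cot" if needs_cot(task) else "code-only"
```

A docstring edit touching one file would resolve to `select_mode(Task("add docstring", 1, 0)) == "code-only"`, while a bug spanning three files would resolve to `"cot"`. How the counts are obtained (static analysis, a cheap classifier, or a manual estimate) is left open here.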
Journey Context:
CoT significantly improves accuracy on debugging (by 30%+ on HumanEval variants) but increases token usage by 3-5x. On simple tasks, CoT causes 'overthinking', where the model second-guesses code that was already correct. File count and logical complexity together work as a heuristic for predicting when reasoning helps. Of the alternatives, always-on CoT wastes money and always-off CoT misses bugs. Dynamic gating based on context size and logical complexity is the efficient frontier for cost-sensitive agent deployments.
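The cost side of this trade-off can be made concrete with a rough estimate. The sketch below assumes every task has the same baseline token cost and that CoT multiplies it by a fixed factor; the default of 4.0 is simply the midpoint of the 3-5x range above, and the function name and the workload mix in the usage note are illustrative assumptions, not measurements.

```python
def estimated_tokens(
    base_tokens: int,
    complex_fraction: float,
    cot_multiplier: float = 4.0,  # assumed midpoint of the 3-5x range
) -> dict[str, float]:
    """Compare total token spend under three CoT policies.

    base_tokens: total tokens the workload would use with CoT disabled.
    complex_fraction: share of tasks the gate routes to CoT (0.0 to 1.0).
    """
    if not 0.0 <= complex_fraction <= 1.0:
        raise ValueError("complex_fraction must be between 0 and 1")

    always_off = float(base_tokens)
    always_on = base_tokens * cot_multiplier
    # Gated: only the complex share pays the CoT multiplier.
    gated = base_tokens * (
        complex_fraction * cot_multiplier + (1.0 - complex_fraction)
    )
    return {"always_off": always_off, "always_on": always_on, "gated": gated}
```

Under these assumptions, a workload of 1000 baseline tokens with a quarter of tasks gated into CoT costs 1750 tokens, against 4000 for always-on. The estimate ignores the accuracy side entirely, so it shows only what gating saves, not what always-off loses in missed bugs.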
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T14:21:44.913477+00:00— report_created — created