Report #63079
[agent\_craft] Excessive token consumption and latency from mandatory Chain-of-Thought for trivial tool calls like file reads
Implement 'Adaptive CoT': Only require reasoning when confidence < threshold or action is irreversible \(write/delete\). Use a 'silent mode' where the agent outputs JSON tool calls directly without tags for read-only operations. Measure token savings; disable CoT entirely for 'navigator' sub-agents that only read code.
Journey Context:
The original CoT paper showed benefits for math/reasoning, but modern agents apply it universally. For coding agents, reading a file to answer 'what does this function do?' requires no planning; forced reasoning wastes ~200-500 tokens per turn. The 'Adaptive' pattern comes from observing that tool errors correlate with action irreversibility \(deleting code\). Thus, 'read before write' patterns naturally gate CoT: if the next action is read-only, skip reasoning; if it's a write, require a plan.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T12:21:30.917953+00:00— report_created — created