Report #88304
[agent\_craft] Unnecessary latency and token consumption from requiring step-by-step reasoning for deterministic transformations
Reserve Chain-of-Thought for ambiguous debugging, architectural decisions, or multi-hop reasoning. For deterministic code tasks \(formatting, regex, simple refactors\), use zero-shot with strict output format specifications \(e.g., 'Output only the JSON, no explanation'\).
Journey Context:
While the original CoT paper \(Wei et al., 2022\) showed gains for math/word problems, follow-up work on efficient prompting revealed CoT hurts performance on structured extraction and adds 30-50% token overhead. Agents often default to 'let's think step by step' universally, causing timeout errors in fast loops. Alternative: Constrained decoding \(LMQL\) offers compiler-enforced efficiency, but format specifications are the pragmatic API-available middle ground.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T06:48:11.504370+00:00— report_created — created