Report #27617
[counterintuitive] Chaining every task into sequences of micro-prompts is slower and often worse than using fewer, more capable calls
Consolidate related subtasks into single calls with tool use. Use one call with multiple tool invocations rather than a chain of single-tool calls. Only decompose when: (1) subtasks require different model configurations (cheap model for extraction, expensive for reasoning), (2) intermediate results need human review, or (3) the task genuinely exceeds the model's effective reasoning capacity in a single call.
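The three conditions above can be read as a grouping rule: keep merging consecutive subtasks into one call until one of the conditions forces a split. The following is a minimal sketch of that rule, not a reference implementation. The `Subtask` fields, the model tier labels, and the `max_tokens_per_call` budget are all hypothetical, and a token estimate is only a crude stand-in for "effective reasoning capacity".

```python
from dataclasses import dataclass


@dataclass
class Subtask:
    name: str
    model: str = "strong"             # hypothetical model tier label
    needs_human_review: bool = False  # output must be reviewed before the next step
    est_tokens: int = 0               # rough, assumed estimate of context needed


def plan_calls(subtasks, max_tokens_per_call=8000):
    """Group consecutive subtasks into as few calls as the three conditions allow.

    max_tokens_per_call is an arbitrary illustrative budget, not a real model limit.
    """
    calls = []
    current = []
    for task in subtasks:
        if current:
            prev = current[-1]
            used = sum(t.est_tokens for t in current)
            split = (
                task.model != prev.model                       # (1) different configuration
                or prev.needs_human_review                     # (2) human review gate
                or used + task.est_tokens > max_tokens_per_call  # (3) capacity proxy
            )
            if split:
                calls.append(current)
                current = []
        current.append(task)
    if current:
        calls.append(current)
    return calls


# Illustrative pipeline: only the cheap extraction step is split off.
example = [
    Subtask("extract_requirements", model="cheap", est_tokens=1000),
    Subtask("design", est_tokens=2000),
    Subtask("implement", est_tokens=3000),
    Subtask("write_tests", est_tokens=2000),
]
plan = plan_calls(example)
call_names = [[t.name for t in call] for call in plan]
```

Under these assumptions the four subtasks collapse into two calls: the extraction step on the cheap tier, and design, implementation, and tests together in one call where each can see the others' decisions.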
Journey Context:
The 2023 orthodoxy was 'decompose everything': break each coding task into the smallest possible subtasks and chain them through a pipeline. This was partly driven by GPT-3.5's limitations, since it couldn't reliably handle complex multi-step tasks in one call. With GPT-4-class models, excessive decomposition introduces new failure modes: error accumulation across calls (each step can fail independently, and any failure breaks the chain), loss of context between steps (later calls can't see earlier reasoning unless it is passed along explicitly), latency multiplication (N calls each pay round-trip overhead), and coherence problems where earlier design decisions aren't visible to later implementation steps. A model that can see the full context in one call makes more coherent decisions. OpenAI's own prompt engineering guide recommends splitting tasks only when they're genuinely complex, not as a default practice.
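The error-accumulation and latency points can be made concrete with some back-of-the-envelope arithmetic. The sketch below assumes that steps fail independently with the same probability and that every call pays a fixed round-trip overhead. Both are simplifying assumptions, and the numbers are illustrative rather than measured.

```python
def chain_success_probability(per_step_success, steps):
    """Probability that every step in a chain succeeds, assuming independent steps."""
    return per_step_success ** steps


def chained_latency_s(steps, overhead_s, total_generation_s):
    """Latency of a chain: each call pays the round-trip overhead once."""
    return steps * overhead_s + total_generation_s


def single_call_latency_s(overhead_s, total_generation_s):
    """Latency of one consolidated call doing the same total generation work."""
    return overhead_s + total_generation_s


# Illustrative numbers only: 8 steps, 95% per-step reliability,
# 0.5 s overhead per call, 12 s of total generation time.
p_chain = chain_success_probability(0.95, 8)
t_chain = chained_latency_s(8, 0.5, 12.0)
t_single = single_call_latency_s(0.5, 12.0)
```

With these inputs, a chain of eight steps that are each 95% reliable completes end to end only about 66% of the time, and spends 16 s versus 12.5 s for the consolidated call. A single call is not guaranteed to match the per-step reliability, so treat this as an upper bound on what the chain can achieve rather than a direct comparison.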
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T00:45:10.684997+00:00— report_created — created