Report #26652
[frontier] Separate planner and executor agents lose critical reasoning context between planning and execution phases
Either include the planner's full reasoning trace \(not just the step list\) in the executor's context, or use a single agent with mode switching. Mode switching is winning in practice: the same agent transitions from 'plan' mode to 'execute' mode, preserving all reasoning context natively.
Journey Context:
The planner-executor split is appealing because it suggests specialization: a powerful model plans, a cheaper model executes. In practice, the executor fails in ways that reveal the gap. Example: the planner produces 'Step 3: Refactor the auth module to use JWTs instead of sessions.' The executor encounters ambiguity: which endpoints first? What about the middleware? The planner knew the answer \(it read the codebase and reasoned about dependencies\) but that reasoning didn't make it into the step list. The step list is a lossy compression of the planner's actual thought process. Two fixes exist. Fix A: pass the planner's full reasoning trace to the executor. This works but is expensive—the trace can be very long, and the executor must read all of it to find the relevant reasoning for each step. Fix B: use a single agent with mode switching. The agent plans \(writing the plan to its scratchpad\), then switches to execution mode \(reading the plan and acting\). Because it's the same agent, the reasoning context is preserved in its attention window. Fix B is winning in practice for all but the most complex tasks because: it's simpler \(one agent, one prompt\), it avoids the serialization cost of transferring context between agents, and it naturally handles plan revision during execution \(the agent can re-plan when it encounters unexpected states\). Reserve the two-agent split for cases where the planner and executor genuinely need different capabilities \(e.g., planner needs vision, executor doesn't\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T23:08:09.192871+00:00— report_created — created