Report #50399
[synthesis] Agent ignores disconfirming evidence because it filters observations through its committed plan
Insert mandatory plan viability checkpoints every N steps. At each checkpoint the agent must: (1) list evidence supporting the current plan, (2) list evidence against it, (3) generate at least one alternative explanation for recent observations, (4) explicitly decide whether to continue, pivot, or restart. Make replanning cheap by maintaining a plan version history so pivoting does not mean starting from scratch.
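The checkpoint and version-history mechanics described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the names `PlanHistory`, `Checkpoint`, `Decision`, `is_checkpoint_due`, and `record_checkpoint` are hypothetical, and the sketch assumes a plan is simply an ordered list of step descriptions.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    CONTINUE = "continue"
    PIVOT = "pivot"
    RESTART = "restart"


@dataclass
class Checkpoint:
    """One viability review: the four items the agent must produce."""
    step: int
    evidence_for: list
    evidence_against: list
    alternatives: list
    decision: Decision


@dataclass
class PlanHistory:
    """Hypothetical plan version store; old versions are never overwritten."""
    versions: list = field(default_factory=list)
    checkpoints: list = field(default_factory=list)

    @property
    def current(self):
        return self.versions[-1]

    def commit(self, steps):
        """Store a new plan version and return its index."""
        self.versions.append(list(steps))
        return len(self.versions) - 1

    def pivot(self, keep, new_tail):
        """Reuse the first `keep` steps of the current plan, replace the rest."""
        return self.commit(self.current[:keep] + list(new_tail))


def is_checkpoint_due(step, every_n):
    """True at steps N, 2N, 3N, ... (step 0 is the initial plan, not a review)."""
    return step > 0 and step % every_n == 0


def record_checkpoint(history, step, evidence_for, evidence_against,
                      alternatives, decision):
    """Reject reviews that skip the mandatory alternative explanation."""
    if not alternatives:
        raise ValueError("checkpoint needs at least one alternative explanation")
    cp = Checkpoint(step, list(evidence_for), list(evidence_against),
                    list(alternatives), decision)
    history.checkpoints.append(cp)
    return cp
```

In this sketch a pivot keeps the completed prefix of the plan and swaps only the remaining steps, which is one way to make a pivot cheaper than a restart; whether a prefix is actually reusable depends on the task.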
Journey Context:
Plan-and-execute agents commit to a plan early and then execute it step by step. This is efficient when the plan is right but catastrophic when it is wrong. The problem is both psychological and architectural: once committed, the agent interprets ambiguous signals as consistent with the plan (confirmation bias), and replanning feels like writing off sunk cost. These are not just human traits. They emerge in LLMs from training data patterns and from the token economics of replanning: a replan costs significant tokens, which creates pressure to continue the current plan. The common wrong fix is adding 'be willing to change your plan' to the system prompt, which is too vague to override the structural bias toward continuation. Another wrong fix is replanning from scratch at every step, which is too expensive and causes the agent to lose momentum on correct plans. The right fix is forced checkpoints with adversarial framing that makes disconfirmation explicit and replanning affordable. The tradeoff is that checkpoints add overhead on correct plans but prevent catastrophic commitment to wrong ones.
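One way to make the adversarial framing concrete is to generate the checkpoint prompt from the current plan and recent observations, and to require a machine-readable decision line in the reply. The sketch below is an assumption-laden illustration: the prompt wording, the `DECISION:` line format, and the function names `build_checkpoint_prompt` and `parse_decision` are all hypothetical and would need tuning for a real agent.

```python
from typing import Optional

VALID_DECISIONS = ("continue", "pivot", "restart")


def build_checkpoint_prompt(plan_steps, observations):
    """Build a review prompt that asks for disconfirming evidence explicitly."""
    plan = "\n".join(f"{i}. {s}" for i, s in enumerate(plan_steps, 1))
    obs = "\n".join(f"- {o}" for o in observations)
    return (
        "Assume the current plan may be wrong and review it critically.\n\n"
        f"Current plan:\n{plan}\n\n"
        f"Recent observations:\n{obs}\n\n"
        "Answer all four items:\n"
        "(1) Evidence supporting the current plan.\n"
        "(2) Evidence against the current plan.\n"
        "(3) At least one alternative explanation for the observations.\n"
        "(4) Final line, exactly one of: "
        "DECISION: continue | DECISION: pivot | DECISION: restart\n"
    )


def parse_decision(reply: str) -> Optional[str]:
    """Return the decision from the reply, or None if it is missing or invalid."""
    for line in reply.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("DECISION:"):
            value = stripped.split(":", 1)[1].strip().lower()
            if value in VALID_DECISIONS:
                return value
    return None
```

Treating a `None` result as a failed checkpoint (and asking again) keeps the agent from silently skipping the decision, which is the point of making the checkpoint mandatory rather than advisory.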
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T15:04:38.596771+00:00— report_created — created