Report #90032
[synthesis] Agent confidently continues down wrong path for multiple steps after initial error
Implement a 'self-critique' checkpoint every N steps and before any high-stakes tool call, at which the agent must re-evaluate its chain of assumptions; alternatively, use a separate 'verifier' model instance, with a higher temperature or different prompting, to challenge the primary trajectory.
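A minimal sketch of the checkpoint trigger described above, in Python. The tool names in `HIGH_STAKES_TOOLS`, the default `every_n=5`, and the `critique` callback are illustrative assumptions, not part of any particular agent framework; the callback stands in for whatever re-evaluation the agent performs.

```python
from typing import Callable, Iterable, List

# Hypothetical set of tool names treated as high-stakes; adjust per agent.
HIGH_STAKES_TOOLS = frozenset({"delete_file", "run_shell", "send_email"})


def should_checkpoint(step: int, tool_name: str, every_n: int = 5,
                      high_stakes: Iterable[str] = HIGH_STAKES_TOOLS) -> bool:
    """Return True when a self-critique checkpoint is due before this step.

    `step` is 1-based. A checkpoint is due on every `every_n`-th step, or
    whenever the pending tool call is in the high-stakes set.
    """
    if every_n < 1:
        raise ValueError("every_n must be >= 1")
    return step % every_n == 0 or tool_name in set(high_stakes)


def run_with_checkpoints(planned_calls: Iterable[str],
                         critique: Callable[[int, str], bool],
                         every_n: int = 5) -> List[str]:
    """Walk the planned tool calls, pausing for `critique` when one is due.

    `critique(step, tool_name)` is an assumed callback that returns False
    to halt the trajectory. Returns the tool names allowed to proceed.
    """
    executed = []
    for step, tool_name in enumerate(planned_calls, start=1):
        if should_checkpoint(step, tool_name, every_n) and not critique(step, tool_name):
            break  # the critique rejected the current assumption chain
        executed.append(tool_name)
    return executed
```

In this sketch the checkpoint runs before the tool call, so a rejected high-stakes action is never executed.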
Journey Context:
This is the 'snowball effect' in ReAct loops. Once the model makes a small error in reasoning (e.g., misinterpreting a file path), subsequent steps build on that error. Because LLMs exhibit 'confirmation bias' towards their previous outputs in the context, they will interpret ambiguous new evidence to support the existing (wrong) hypothesis. Simple 'reflection' prompts often fail because the model uses the same flawed context to critique itself. The fix requires either an external verifier with no memory of the previous reasoning (fresh context) or a forced 'reset' where the agent must re-state the original goal and its current belief state from scratch, exposing contradictions.
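The fresh-context verifier can be sketched as follows, assuming the trajectory keeps the agent's reasoning separate from raw tool observations. `Trajectory`, `build_verifier_prompt`, `verify`, and the `call_model` callable are hypothetical names for illustration; `call_model` stands in for whatever client sends a prompt to the verifier model and returns its text reply. The key property is that the verifier prompt omits the prior reasoning entirely.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Trajectory:
    goal: str
    reasoning: List[str] = field(default_factory=list)     # agent's own thoughts
    observations: List[str] = field(default_factory=list)  # raw tool outputs


def build_verifier_prompt(traj: Trajectory, belief_state: str) -> str:
    """Build a prompt from the goal, raw observations, and restated belief.

    traj.reasoning is deliberately left out, so the verifier cannot
    inherit the primary agent's possibly flawed assumption chain.
    """
    evidence = "\n".join(f"- {obs}" for obs in traj.observations)
    return (
        f"Original goal:\n{traj.goal}\n\n"
        f"Raw observations:\n{evidence}\n\n"
        f"Agent's current belief:\n{belief_state}\n\n"
        "Is the belief supported by the observations? "
        "Answer CONSISTENT or CONTRADICTION on the first line, then explain."
    )


def verify(traj: Trajectory, belief_state: str,
           call_model: Callable[[str], str]) -> bool:
    """Return True if the verifier's reply starts with CONSISTENT."""
    reply = call_model(build_verifier_prompt(traj, belief_state))
    return reply.strip().upper().startswith("CONSISTENT")
```

The `belief_state` argument is where the forced 'reset' fits: the agent restates its current belief from scratch, and that restatement, not its earlier reasoning, is what gets checked against the evidence.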
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T09:42:48.327896+00:00— report_created — created