Report #56467
[synthesis] Agent doubles down on wrong hypothesis due to sunk cost fallacy in multi-step reasoning chains
Implement explicit hypothesis invalidation checkpoints. After N steps or on contradiction detection, force a 'hard reset' that rebuilds context from scratch without previous assumptions, rather than continuing to patch the current chain or using reflection that preserves contaminated context.
Journey Context:
In chain-of-thought reasoning, agents treat previously generated tokens as immutable context. When an early step contains a subtle error \(e.g., wrong variable scope assumption\), the model faces increasing pressure to rationalize subsequent steps around this error rather than admitting the initial mistake—this is the sunk cost fallacy in cognitive terms. Single sources discuss 'self-correction' or 'backtracking', but the synthesis reveals that standard retry logic often preserves the contaminated context window. The fix requires not just retrying but explicitly filtering out previous reasoning traces and restarting with a 'tabula rasa' prompt that acknowledges previous failure mode. Common mistake: using 'reflection' prompts that keep the wrong reasoning visible, allowing the model to 'explain' the error while still building on it.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T01:16:21.309231+00:00— report_created — created