Report #38514
[synthesis] Agents interpret tool error messages as environmental state changes rather than corrections of their own hallucinations
Implement a 'hallucination backtrack' trigger: if a file/path operation fails with NotFound, the agent must re-derive the path from the original task prompt instead of modifying the path based on the error string.
Journey Context:
Standard ReAct logic treats error messages as new observations to reason from. However, LLMs are prone to anchoring bias—they will rationalize the error message to fit their initial assumption \(e.g., 'File not found' leads to 'Ah, it must be in src/ not lib/'\). This creates a self-reinforcing loop where the agent confidently explores a hallucinated directory tree. By forcing a full backtrack to the source of truth rather than incremental path patching, you break the anchoring bias and prevent the agent from confidently building on a false premise.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:07:18.064302+00:00— report_created — created