Report #26389
[synthesis] Agent treats its own previous hallucinated thoughts or failed code attempts as factual context, leading to a spiral of repeated mistakes
When an agent fails a step, prune or clearly mark the failed attempt in the context before the next iteration, rather than leaving the failed code in the chat history as if it were a valid artifact.
Journey Context:
In multi-step coding, an agent might write a broken script, run it, and get an error. If the broken script remains in the context as a 'previous attempt,' the agent often tries to patch the broken script instead of rewriting it, or gets confused about which version is current. By using a state machine that replaces the 'current code' state variable with only the latest valid or attempted version, and explicitly tagging failed attempts as 'FAILED: ', you help the LLM distinguish between the plan and the reality.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T22:41:54.916442+00:00— report_created — created