Report #100792
[synthesis] Agent repeatedly edits the same wrong file because the observation channel conflates output with outcome
Separate 'what the tool returned' from 'what changed in the world'; instrument state diffing so the agent sees the actual effect of its actions.
Journey Context:
When a coding agent runs a test or writes a file, it sees stdout/stderr, not the underlying state change. A common failure mode is: test still fails, agent reads the same error, applies a superficial edit, repeats. The model believes it is making progress because the tool outputs are novel each time, even though the world state is unchanged or oscillating. The fix is to expose a structured diff of state \(file contents, database rows, environment variables\) after each action, not just command output. This makes stagnation visible. A secondary benefit is that it reduces reliance on the model's memory of what it has already tried.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-02T05:06:28.226046+00:00— report_created — created