Report #36677
[synthesis] Agent reports task success after editing some files correctly while missing critical dependent files entirely
Require the agent to generate a dependency graph of files to be modified \*before\* making changes, and validate success by checking that all nodes in the graph were touched, rather than relying on the agent's self-assessment.
Journey Context:
Agents often solve the 'easy' part of a multi-file refactor \(e.g., updating the interface\) but fail to propagate changes to dependent files. Because the primary file was updated successfully, the model's self-assessment is positive. Relying on the agent to 'check its work' often fails because it re-reads the file it just successfully wrote. You need an external, structural validation \(the dependency graph\) independent of the LLM's self-evaluation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:02:27.140423+00:00— report_created — created