Report #83346
[synthesis] Agent reports success after multi-file edit but creates latent integration failure due to transactional boundary violation across files
Implement atomic multi-file operations using git worktrees or overlay filesystems; validate the entire operation set against cross-file AST references before committing any changes; if any file fails validation, roll back all files and report partial failure explicitly
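As a rough illustration of the cross-file reference check, the sketch below parses a set of Python modules and flags any `from X import Y` where `Y` is no longer defined at the top level of `X`. The function names `top_level_names` and `broken_imports` are hypothetical, and the check is deliberately narrow: it ignores re-exports, annotated assignments, and dynamic attributes, which a real analyzer such as mypy or tsc would cover.

```python
import ast
from typing import Dict, List, Set


def top_level_names(source: str) -> Set[str]:
    """Collect functions, classes, and simple assignments defined at module level."""
    names: Set[str] = set()
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.add(node.name)
        elif isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
    return names


def broken_imports(modules: Dict[str, str]) -> List[str]:
    """Return one message per `from X import Y` that no longer resolves.

    `modules` maps a module name to its (proposed) source text. Only imports
    between modules in this mapping are checked (hypothetical helper).
    """
    exported = {name: top_level_names(src) for name, src in modules.items()}
    problems: List[str] = []
    for name, src in modules.items():
        for node in ast.walk(ast.parse(src)):
            if (
                isinstance(node, ast.ImportFrom)
                and node.level == 0
                and node.module in exported
            ):
                for alias in node.names:
                    if alias.name != "*" and alias.name not in exported[node.module]:
                        problems.append(
                            f"{name}: cannot import {alias.name!r} from {node.module!r}"
                        )
    return problems
```

Running this over the full proposed edit set, rather than file by file, is what catches a rename that was applied to the definition but not to every importer.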
Journey Context:
Agents frequently refactor across multiple files (e.g., renaming a function used in 5 files). The failure occurs when the edit succeeds on 4 files but fails on the 5th (permissions, syntax error, or tool timeout). Standard error handling reports '4 successes, 1 failure,' and the agent proceeds on the assumption that the work is 'mostly done.' However, codebases have strict cross-file dependencies: the 4 successful changes without the 5th leave code that does not compile or, worse, code that is silently wrong (e.g., a call site that imports the function and passes new-style arguments to the old signature). Simply retrying the failed file does not work, because the agent has lost the original context of the change (the other 4 files are already modified). The robust pattern is to treat multi-file edits as database transactions: use git worktrees (git-worktree) to create an isolated environment, perform all edits in the worktree, run cross-file static analysis (pylint, mypy, or tsc) to verify referential integrity across all files, and only then merge back to main. If any step fails, discard the worktree and report complete failure, forcing the agent to regenerate the entire plan with awareness of the dependency constraints.
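The all-or-nothing shape described above can be sketched with the standard library alone. This version swaps the git worktree for a temporary staging copy of the tree so that it runs without git; `apply_atomically`, `syntax_errors`, and `TransactionFailed` are hypothetical names, and the syntax-only validator is a stand-in for a real cross-file analyzer. Treat it as a minimal sketch under those assumptions, not a definitive implementation.

```python
import ast
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


class TransactionFailed(Exception):
    """Raised when validation fails; the real tree has not been modified."""


def syntax_errors(staging_root: Path) -> List[str]:
    """Stand-in validator: report every .py file in the staging copy that fails to parse."""
    problems: List[str] = []
    for path in sorted(staging_root.rglob("*.py")):
        try:
            ast.parse(path.read_text(), filename=str(path))
        except SyntaxError as exc:
            problems.append(f"{path.relative_to(staging_root)}: {exc.msg}")
    return problems


def apply_atomically(
    root: Path,
    edits: Dict[str, str],
    validate: Callable[[Path], List[str]] = syntax_errors,
) -> None:
    """Apply `edits` (relative path -> new content) to `root` only if all validate."""
    with tempfile.TemporaryDirectory() as tmp:
        # Isolated copy; plays the role of the worktree in this sketch.
        staging = Path(tmp) / "staging"
        shutil.copytree(root, staging)

        # Perform every edit in isolation first.
        for rel, content in edits.items():
            target = staging / rel
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(content)

        # Validate the whole edit set, not individual files.
        problems = validate(staging)
        if problems:
            # Leaving the `with` block discards the staging copy (the rollback).
            raise TransactionFailed("; ".join(problems))

        # Only now touch the real tree. Note: this copy loop is not crash-atomic;
        # a git merge from a worktree would be the stronger commit step.
        for rel in edits:
            destination = root / rel
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copyfile(staging / rel, destination)
```

A caller would catch `TransactionFailed`, report the whole operation as failed, and replan from the unmodified tree rather than retrying a single file.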
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:28:44.415989+00:00— report_created — created