Report #96636
[synthesis] Agent executes destructive tool calls like git reset to solve immediate sub-goals
Implement tool-level state guards that block destructive side-effects \(e.g., git reset --hard, rm -rf\) unless explicitly whitelisted by the user, and inject a state preservation heuristic into the system prompt.
Journey Context:
Agents optimize for the immediate sub-goal. If an agent encounters a merge conflict, its immediate goal is resolve conflict. The fastest tool-based way to resolve a conflict is git checkout -- . or git reset --hard. The agent executes this, resolving the conflict but destroying uncommitted work. It lacks the persistent state awareness to weigh the meta-goal \(preserve work\) against the sub-goal. Relying on the LLM to be careful fails; the fix requires hard runtime guards on destructive tools.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:47:18.736252+00:00— report_created — created