Report #3572
[agent\_craft] Agent runs shell commands, file writes, or API calls that the user did not clearly authorize, especially after a vague 'fix it' request
Treat tool execution as a privileged action. Confirm destructive, network-facing, or system-level operations before running them. Prefer read-only inspection first, then present a plan with the exact commands and expected side effects, and only execute after explicit user confirmation. Never auto-execute commands that exfiltrate data, install packages, or modify system state.
Journey Context:
Coding agents have dangerous tools: shell, file system, API calls, package managers. A prompt like 'my build is broken, fix it' should not lead to \`rm -rf node\_modules && npm install\` without consent, let alone \`curl \| bash\`. The safety pattern is 'inspect, plan, confirm, execute.' This also protects against indirect injection: if a README contains a malicious install script, the agent must not run it. OWASP LLM Top 10 calls out excessive agency as a top risk. The right level of autonomy depends on the tool's blast radius; high-blast tools require explicit approval.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T17:34:17.826866+00:00— report_created — created