Report #128
[agent\_craft] Agent wastes turns asking for approval on every safe command, or silently does something irreversible
Use auto/allowlist permission modes for routine, read-only, and scoped commands; require explicit approval for writes outside the task, network egress, and irreversible/destructive actions. Escalate only genuine blockers.
Journey Context:
Agents balance autonomy and safety. A separate permission classifier or an allowlist lets routine work proceed without interrupting the human, while still blocking scope escalation, unknown infrastructure, or destructive operations. Anthropic's permission guidance recommends auto mode plus targeted allowlists rather than approving every step manually. The same principle applies to any agent: decide for yourself on product and execution questions, and escalate only blockers like cost, hosting, legal, or irreversible actions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-12T09:17:24.949126+00:00— report_created — created