Report #3341
[agent_craft] User asks the agent to write code that gives itself unbounded capabilities: auto-execute, persistent loops, autonomous browsing, or unreviewed file writes
Require explicit human-in-the-loop checkpoints for irreversible or external-facing actions. Generate code that lists its planned actions, asks for confirmation before running them, supports a dry-run mode, and logs each decision. Refuse to write fully autonomous execution loops that run without oversight.
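A minimal sketch of such a checkpoint, assuming a plain CLI prompt is an acceptable confirmation channel. `PlannedAction` and `execute_plan` are hypothetical names made up for illustration, not part of any library. The gate prints the full plan first, defaults to dry-run, asks before each action marked irreversible, and logs every decision.

```python
import logging
from dataclasses import dataclass
from typing import Callable, Iterable, List

logger = logging.getLogger("action_gate")


@dataclass
class PlannedAction:
    """One step the tool intends to take (hypothetical structure)."""
    description: str
    run: Callable[[], None]
    irreversible: bool = False  # set True for deletes, network calls, file writes


def execute_plan(
    actions: Iterable[PlannedAction],
    dry_run: bool = True,
    confirm: Callable[[str], str] = input,
) -> List[str]:
    """List the plan, then run each action only if it is approved.

    Returns one decision record per action, e.g. "executed: <description>".
    """
    plan = list(actions)
    decisions: List[str] = []

    # Show the whole plan before anything runs.
    for i, action in enumerate(plan, start=1):
        flag = " [irreversible]" if action.irreversible else ""
        print(f"{i}. {action.description}{flag}")

    for action in plan:
        if dry_run:
            # Dry-run mode never calls action.run().
            decision = f"dry-run: {action.description}"
        elif action.irreversible and confirm(
            f"Run '{action.description}'? [y/N] "
        ).strip().lower() != "y":
            # Anything other than an explicit "y" counts as a refusal.
            decision = f"skipped: {action.description}"
        else:
            action.run()
            decision = f"executed: {action.description}"
        logger.info(decision)
        decisions.append(decision)
    return decisions
```

Defaulting `dry_run` to `True` means a caller has to opt in to real execution. Passing `confirm` as a parameter keeps the gate testable and lets the prompt be swapped for another review channel.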
Journey Context:
LLM agents exhibit 'excessive agency' when they can take actions without authorization. The agent itself is a coding tool, not an autonomous operator, so code that grants unbounded agency to the model (or to a wrapper around it) is a safety-critical design flaw. The NIST AI RMF makes governance one of its four core functions (Govern, Map, Measure, Manage), which covers human oversight of AI systems.
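One way to keep agency bounded rather than open-ended is to combine a tool allowlist with a hard step budget. The sketch below is illustrative only: `ALLOWED_TOOLS`, `run_bounded`, and the stub tools are hypothetical names, and a real wrapper would also need the confirmation and logging described in this report.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool registry: only these names may ever be invoked.
# The handlers are stubs that return placeholder strings.
ALLOWED_TOOLS: Dict[str, Callable[[str], str]] = {
    "read_file": lambda arg: f"(contents of {arg})",
    "list_dir": lambda arg: f"(entries in {arg})",
}


def run_bounded(
    requests: List[Tuple[str, str]],
    max_steps: int = 5,
) -> List[str]:
    """Process (tool, argument) requests under an allowlist and a step budget."""
    results: List[str] = []
    for step, (tool, arg) in enumerate(requests):
        if step >= max_steps:
            # Stop and hand control back instead of looping indefinitely.
            results.append("stopped: step budget exhausted, needs human review")
            break
        handler = ALLOWED_TOOLS.get(tool)
        if handler is None:
            # Requests outside the allowlist are refused, not executed.
            results.append(f"denied: {tool}")
            continue
        results.append(handler(arg))
    return results
```

A denied request still counts against the budget here, so repeated attempts at disallowed tools cannot extend the loop.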
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T16:32:36.429188+00:00— report_created — created