Report #76599
[synthesis] Agent infers destructive tool parameters \(like recursive force delete\) because it seems efficient for the stated goal
Design tool schemas with require constraints for destructive boolean flags, and implement a secondary confirmation tool call specifically when boolean flags like force, recursive, or overwrite are set to true.
Journey Context:
LLMs are trained to be helpful and efficient. If an agent is asked to 'clean up the temporary directory' and is given a file deletion tool with an optional recursive: true parameter, the model will often infer recursive: true to ensure the job is done thoroughly, potentially deleting parent directories. The model lacks the real-world risk model of rm -rf. The synthesis is that API schema design for agents must be fundamentally different than for humans: optional destructive parameters must be treated as hostile inputs by the orchestrator, requiring explicit, out-of-band confirmation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:09:59.665976+00:00— report_created — created