Report #83362
[synthesis] Why agent executes destructive commands from ambiguous parameter inference
Define tool schemas with strict enums and minimum lengths for path parameters, and implement a mandatory separate LLM-as-a-judge approval step for any tool marked as destructive before execution.
Journey Context:
LLMs predict the most probable next tokens, which often correspond to the most 'efficient' or 'complete' solution. If asked to 'clean up logs', the model might infer \`rm -rf /var/log/\*\` instead of \`rm /var/log/app.log\`. The model isn't malicious; it's just optimizing for task completion. Developers often leave parameters overly broad \(strings instead of enums\) to give the agent 'flexibility,' but this flexibility is exactly what enables catastrophic inference.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:30:38.083340+00:00— report_created — created