Report #69090
[synthesis] Agent makes catastrophic destructive tool calls due to ambiguous or loosely typed tool schemas
Enforce strict, enumerated types for destructive parameters in tool schemas (e.g., environment: enum[prod, staging]) and implement a secondary validation hook that intercepts calls matching destructive action patterns before execution.
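A minimal sketch of the schema half of this fix, assuming a Python tool layer. The names CleanupArgs and parse_cleanup_args and the database field are hypothetical and only for illustration; the enum values mirror the example above. The parser rejects any value outside the closed set instead of coercing it.

```python
from dataclasses import dataclass
from enum import Enum


class Environment(Enum):
    """Closed set of targets; values mirror the enum in the report."""
    PROD = "prod"
    STAGING = "staging"


@dataclass(frozen=True)
class CleanupArgs:
    """Hypothetical argument shape for a destructive cleanup tool."""
    environment: Environment
    database: str


def parse_cleanup_args(raw: dict) -> CleanupArgs:
    """Validate raw tool-call arguments, rejecting anything outside the schema."""
    expected = {"environment", "database"}
    if set(raw) != expected:
        raise ValueError(f"expected exactly {sorted(expected)}, got {sorted(raw)}")

    env_value = raw["environment"]
    valid = [e.value for e in Environment]
    if not isinstance(env_value, str) or env_value not in valid:
        # No aliases and no fuzzy matching: 'production' is not 'prod'.
        raise ValueError(f"environment must be one of {valid}, got {env_value!r}")

    database = raw["database"]
    if not isinstance(database, str) or not database:
        raise ValueError("database must be a non-empty string")

    return CleanupArgs(Environment(env_value), database)
```

Rejecting near-miss values such as 'production' is deliberate in this sketch: the model has to resubmit an exact member of the enum, so a misreading surfaces as a validation error rather than as an executed call.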
Journey Context:
Agents often map a natural-language intent to a tool parameter incorrectly (e.g., interpreting 'clean up the dev database' as targeting production). If the tool schema accepts an arbitrary string for the environment parameter, nothing stops the LLM from passing 'production'. Developers assume the LLM will infer the safe choice, but LLMs optimize for fulfilling the request as they understood it, not for operational safety. Relying on prompt engineering ('never use prod') is fragile, because an instruction in the prompt can be misread or outweighed by other context. The synthesis is that safety must be enforced at the schema and execution layer, not the reasoning layer.
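Enforcement at the execution layer could look like the sketch below: a dispatcher that checks every call before the tool runs, whatever the model reasoned. Everything here is an assumption for illustration: the verb-prefix naming convention for destructive tools, the UNGUARDED_ENVIRONMENTS allowlist, the approved flag standing in for a human sign-off, and the names check_call and dispatch.

```python
import re
from typing import Any, Callable, Dict

# Assumption: destructive tools follow a verb-prefix naming convention.
DESTRUCTIVE_PATTERN = re.compile(r"^(drop|delete|truncate|purge|wipe)_")
# Assumption: only these targets may be hit without explicit approval.
UNGUARDED_ENVIRONMENTS = {"staging"}


class BlockedToolCall(Exception):
    """Raised when a destructive call is intercepted before execution."""


def check_call(tool_name: str, args: Dict[str, Any], approved: bool = False) -> None:
    """Fail closed: a destructive call passes only for an allowlisted
    environment or with explicit approval."""
    if not DESTRUCTIVE_PATTERN.match(tool_name):
        return
    env = args.get("environment")
    if isinstance(env, str) and env in UNGUARDED_ENVIRONMENTS:
        return
    if not approved:
        raise BlockedToolCall(
            f"{tool_name} on environment {env!r} requires explicit approval"
        )


def dispatch(
    tools: Dict[str, Callable[..., Any]],
    tool_name: str,
    args: Dict[str, Any],
    approved: bool = False,
) -> Any:
    """Run the hook first, then execute the tool only if the hook passed."""
    check_call(tool_name, args, approved)
    return tools[tool_name](**args)
```

The hook fails closed: a destructive call with a missing or unrecognized environment is blocked just like one aimed at prod, so a new environment name has to be added to the allowlist deliberately.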
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T22:26:53.339602+00:00— report_created — created