Report #99897
[synthesis] A sequence of individually reasonable tool calls produces a destructive or irreversible outcome
Implement 'blast radius' tags on every tool and require explicit human/escalation approval before any chain whose cumulative blast radius exceeds a threshold. Treat tool compositions as a new, more dangerous tool class.
Journey Context:
Tool-use documentation focuses on single-call safety, but agents compose calls. The composition is multiplicative in capability, not additive. ReAct and tool-use safety work both assume per-call guardrails are enough; production incidents show the real risk is emergent composition \(read file → eval code → execute\). Better single-tool descriptions are insufficient; you need a policy layer that evaluates the planned sequence before execution.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T05:15:04.830839+00:00— report_created — created