Report #71078
[tooling] Agent invokes expensive or destructive operations without confirmation or awareness of cost
Use MCP tool annotations (destructiveHint, readOnlyHint, openWorldHint) to declare tool characteristics, and implement client-side gating for destructive or high-cost operations. Mark data-retrieval tools with readOnlyHint: true, deletion tools with destructiveHint: true, and tools that call external APIs with openWorldHint: true.
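As a rough illustration of the marking step, the sketch below writes three tool definitions as plain Python dicts shaped like entries in an MCP tools/list result. The tool names (search_notes, delete_file, call_billing_api), their descriptions, their schemas, and the helper annotations_by_name are hypothetical and exist only for this example. Only the annotation keys come from the report.

```python
import json

# Hypothetical tool definitions; names and schemas are invented for illustration.
TOOLS = [
    {
        "name": "search_notes",
        "description": "Full-text search over local notes.",
        "inputSchema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
        # Data retrieval: no side effects, stays inside a closed local store.
        "annotations": {"readOnlyHint": True, "openWorldHint": False},
    },
    {
        "name": "delete_file",
        "description": "Delete a file from the workspace.",
        "inputSchema": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
        # Deletion: irreversible change to local state.
        "annotations": {"readOnlyHint": False, "destructiveHint": True, "openWorldHint": False},
    },
    {
        "name": "call_billing_api",
        "description": "Call a metered third-party API.",
        "inputSchema": {
            "type": "object",
            "properties": {"payload": {"type": "object"}},
            "required": ["payload"],
        },
        # External call: reaches systems outside the server's control.
        "annotations": {"readOnlyHint": False, "openWorldHint": True},
    },
]


def annotations_by_name(tools):
    """Map each tool name to its annotations dict (empty if none were declared)."""
    return {tool["name"]: tool.get("annotations", {}) for tool in tools}


if __name__ == "__main__":
    print(json.dumps(annotations_by_name(TOOLS), indent=2))
```

Keeping the annotations next to each definition lets a client read them straight from the tool listing, without parsing descriptions or relying on naming conventions.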
Journey Context:
Many developers omit the annotations field in tool definitions, treating all tools as equal black boxes and relying on prompt engineering to prevent accidents. Annotations address the autonomy-versus-safety tradeoff declaratively instead. Marking a tool with `destructiveHint: true` (e.g., file deletion, email sending) or `openWorldHint: true` (tools that reach external systems, such as API calls that cost money or have side effects) lets an MCP client intercept the call and require human approval without complex prompt engineering. Conversely, `readOnlyHint: true` identifies tools that are safe to run speculatively during agent planning phases. This is more robust than relying on the LLM to "ask for permission" via prompting, which is vulnerable to jailbreaking and to safety instructions being truncated out of the context window.
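A minimal sketch of the client-side gate described above, using plain dicts rather than any particular MCP SDK. The function names (effective_hints, needs_approval, gated_call), the invoke and approve callbacks, and the approval policy itself are assumptions made for illustration. The fallback values for missing hints reflect my reading of the MCP specification's defaults and should be checked against the spec version in use. The hints are advisory, so a gate like this is only as trustworthy as the server that supplies them.

```python
from typing import Any, Callable

# Assumed defaults for absent hints (verify against the MCP spec version in use).
# A tool with no annotations is treated as destructive and open-world.
DEFAULT_HINTS = {"readOnlyHint": False, "destructiveHint": True, "openWorldHint": True}


def effective_hints(tool: dict[str, Any]) -> dict[str, bool]:
    """Merge a tool's declared annotations over the conservative defaults."""
    declared = tool.get("annotations") or {}
    return {key: bool(declared.get(key, default)) for key, default in DEFAULT_HINTS.items()}


def needs_approval(tool: dict[str, Any]) -> bool:
    """One possible policy: gate open-world calls and non-read-only destructive calls."""
    hints = effective_hints(tool)
    if hints["openWorldHint"]:
        return True
    return not hints["readOnlyHint"] and hints["destructiveHint"]


def gated_call(
    tool: dict[str, Any],
    args: dict[str, Any],
    invoke: Callable[[str, dict[str, Any]], Any],
    approve: Callable[[str, dict[str, Any]], bool],
) -> Any:
    """Run the tool via `invoke`, asking `approve` first when the policy requires it."""
    if needs_approval(tool) and not approve(tool["name"], args):
        raise PermissionError(f"call to {tool['name']} was not approved")
    return invoke(tool["name"], args)
```

Here approve would typically prompt a human, and invoke would forward the request to the server. Because unannotated tools fall back to the conservative defaults, a missing annotations field results in a confirmation prompt rather than a silent call.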
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T01:53:12.116682+00:00— report_created — created