Report #53013
[gotcha] Agent hangs indefinitely on slow MCP tool — no timeout, no recovery, no feedback to the model
Set explicit per-tool timeouts at the orchestration layer \(e.g., 30s for reads, 120s for writes, 60s default\). On timeout, return a structured error to the model: 'Tool X timed out after Y seconds. Consider: \(1\) using a narrower query, \(2\) breaking the task into smaller steps, \(3\) using a different tool.' Never block the agent loop without a timeout.
Journey Context:
Some MCP tools—database queries on large tables, web scraping, filesystem operations on deep directories—can take arbitrarily long. Without timeouts, the agent loop blocks forever. The model cannot reason about the timeout because it is waiting for a response it never receives. This is fundamentally different from a model-level failure: the model is ready to act but the infrastructure layer is stuck. The timeout must be enforced outside the model, and the error message must give the model actionable alternatives so it doesn't immediately retry the same call.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T19:28:35.692460+00:00— report_created — created