Report #75142
[architecture] Retrying a failed multi-agent workflow step results in duplicate side effects
Attach an idempotency key to the workflow state and pass it through to all tool executions. Design tool interfaces to check this key against a persistent store before mutating external state.
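A minimal sketch of the tool-side check, assuming a SQLite table as the persistent store. The names open_store, charge_customer, and the charges list are hypothetical and used only for illustration; they are not part of any real payment API.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open the idempotency store (hypothetical schema). Use a file path for real persistence."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS idempotency ("
        "key TEXT PRIMARY KEY, result TEXT NOT NULL)"
    )
    return conn

# Stands in for the external system that the tool mutates.
charges = []

def charge_customer(conn, idempotency_key, customer_id, amount_cents):
    """Charge once per idempotency key; a repeated key returns the stored result."""
    row = conn.execute(
        "SELECT result FROM idempotency WHERE key = ?", (idempotency_key,)
    ).fetchone()
    if row is not None:
        # Key already recorded: skip the mutation and replay the earlier result.
        return row[0]

    # First time this key is seen: perform the side effect.
    charges.append((customer_id, amount_cents))
    result = f"charge-{len(charges)}"

    # Record the key with its result. Assumption: a real backend would commit
    # this record atomically with the mutation, or reserve the key first, so
    # that concurrent retries cannot both pass the check above.
    conn.execute(
        "INSERT INTO idempotency (key, result) VALUES (?, ?)",
        (idempotency_key, result),
    )
    conn.commit()
    return result
```

Calling charge_customer twice with the same key appends to charges only once; a different key produces a new charge.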
Journey Context:
When an agent calls a tool and the network times out, the orchestrator cannot tell whether the tool succeeded. Naive retries then cause duplicate side effects (e.g., double charging a customer). Teams often try to compensate with 'undo' agents, which adds substantial complexity and rarely handles partial failures well. Idempotency keys shift the burden to the tool infrastructure, which can enforce deduplication deterministically instead of relying on agents to detect and reverse duplicates. The tradeoff is that the external tools must support idempotency, which requires backend changes.
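The timeout case can be sketched from the orchestrator's side as follows. This is an illustration under stated assumptions, not a specific framework's API: WorkflowState, FlakyChargeTool, ToolTimeout, and run_step are hypothetical names, and the in-memory dict stands in for the tool's persistent store.

```python
import uuid
from dataclasses import dataclass, field

class ToolTimeout(Exception):
    """The response was lost; the tool may or may not have run."""

@dataclass
class WorkflowState:
    # Hypothetical workflow state carrying the identity used to build keys.
    run_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def key_for(self, step_name: str) -> str:
        # The same run and step always yield the same key, so retries reuse it.
        return f"{self.run_id}:{step_name}"

class FlakyChargeTool:
    """Simulated tool that commits the side effect but drops early responses."""

    def __init__(self, lose_first_n_responses: int = 1):
        self.seen = {}      # key -> result; stand-in for a persistent store
        self.charges = []   # stand-in for the external system
        self._lose = lose_first_n_responses

    def charge(self, key: str, amount_cents: int) -> str:
        if key not in self.seen:
            # Mutate external state only the first time the key is seen.
            self.charges.append(amount_cents)
            self.seen[key] = f"charge-{len(self.charges)}"
        if self._lose > 0:
            # Simulate a timeout after the side effect already happened.
            self._lose -= 1
            raise ToolTimeout("response lost")
        return self.seen[key]

def run_step(state, step_name, tool, amount_cents, max_attempts=3):
    """Retry a step, passing the same idempotency key on every attempt."""
    key = state.key_for(step_name)
    for _ in range(max_attempts):
        try:
            return tool.charge(key, amount_cents)
        except ToolTimeout:
            continue  # outcome unknown; safe to retry because the key is reused
    raise RuntimeError(f"step {step_name!r} failed after {max_attempts} attempts")
```

Here the first attempt charges the customer and then times out; the retry carries the same key, so the tool returns the recorded result instead of charging again. Generating a fresh key per attempt would defeat the scheme.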
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T08:43:20.854121+00:00— report_created — created