Report #63631
[architecture] Retrying failed multi-agent workflows causes duplicate external side effects
Assign a globally unique ID at the workflow level, derive an idempotency key for each step from it (for example, workflow ID + step ID), and propagate that key to every agent tool call, so that retrying a failed step is a safe no-op if the side effect already succeeded.
Journey Context:
When Agent A calls a tool and the call times out, the orchestrator cannot tell whether the tool succeeded. If it retries Agent A, it may repeat the side effect, for example by sending a duplicate email. People often rely on agent memory to avoid this, but memory can be eventually consistent or lost on a crash, so it cannot reliably record whether the side effect happened. The correct pattern is borrowed from distributed systems: pass an idempotency key (such as workflow ID + step ID) down the call stack to the external API, which uses it to recognize and deduplicate a repeated request. The tradeoff is that your external APIs must support idempotency keys, but without them, multi-agent workflows cannot be safely retried.
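A minimal sketch of the pattern in Python, under stated assumptions: `idempotency_key`, `FakeEmailAPI`, `FlakyEmailAPI`, and `run_step` are hypothetical names invented for illustration, and the in-memory fake stands in for an external API that deduplicates on a client-supplied key. A real service would define its own way of accepting the key and would persist it durably.

```python
import hashlib


def idempotency_key(workflow_id, step_id):
    """Derive a stable key: the same workflow and step always yield the same key.

    Assumption: neither ID contains ':'; otherwise use an unambiguous encoding.
    """
    raw = f"{workflow_id}:{step_id}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


class FakeEmailAPI:
    """Hypothetical external service that deduplicates on the idempotency key."""

    def __init__(self):
        self.sent = []       # side effects actually performed
        self._results = {}   # idempotency key -> result of the first successful call

    def send_email(self, to, body, key):
        if key in self._results:
            # Already performed for this key: return the stored result, send nothing.
            return self._results[key]
        self.sent.append(to)
        result = f"message-{len(self.sent)}"
        self._results[key] = result
        return result


class FlakyEmailAPI(FakeEmailAPI):
    """Sends the email, then loses the response on the first call (simulated timeout)."""

    def __init__(self):
        super().__init__()
        self._calls = 0

    def send_email(self, to, body, key):
        self._calls += 1
        result = super().send_email(to, body, key)
        if self._calls == 1:
            raise TimeoutError("response lost after the email was sent")
        return result


def run_step(api, workflow_id, step_id, to, body, max_attempts=3):
    """Orchestrator-side retry loop for one workflow step."""
    # The key is computed once per step and reused unchanged on every retry.
    key = idempotency_key(workflow_id, step_id)
    for _ in range(max_attempts):
        try:
            return api.send_email(to, body, key)
        except TimeoutError:
            continue  # outcome unknown; retrying is safe because the key is the same
    raise RuntimeError("step failed after retries")
```

In this sketch the first attempt sends the email but the caller sees a timeout. The retry carries the same key, so the fake service returns the stored result instead of sending again. The key is derived from the workflow and step IDs rather than generated per attempt, which is what lets a retry match the original request.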
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:17:31.716373+00:00— report_created — created