Report #75942
[architecture] Retrying a failed multi-agent workflow leads to duplicate side effects because the orchestrator re-invokes the agent from scratch
Assign each agent invocation a globally unique idempotency key (e.g., workflow_id + step_id) that stays the same across retries, and pass it to every tool call so that tools can reject or ignore duplicate executions.
Journey Context:
LLM calls are non-deterministic and fail frequently (timeouts, refusals). When an orchestrator retries a step, it often has no record that the previous attempt may have partially succeeded at the tool level (e.g., the API call went through but the network dropped before the response arrived). Idempotency keys at the tool/agent boundary make retries safe, provided the key is identical across attempts of the same step. The tradeoff is that each tool implementation must store and check keys; for tools with side effects, that cost is necessary for reliable multi-agent systems.
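A minimal sketch of the key check at the tool boundary, in Python. All names here (`idempotency_key`, `IdempotentTool`, `charge_card`, the `wf-42` / `step-3` identifiers) are hypothetical and chosen for illustration only. The in-memory dict stands in for what would need to be a durable store with an atomic check-and-set in a real deployment.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List


def idempotency_key(workflow_id: str, step_id: str) -> str:
    """Derive a key that is identical for every retry of the same step."""
    # JSON-encode the pair so ("a:b", "c") and ("a", "b:c") cannot collide.
    raw = json.dumps([workflow_id, step_id])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class IdempotentTool:
    """Wraps a side-effecting function and replays the stored result
    when the same idempotency key is seen again."""

    def __init__(self, fn: Callable[..., Any]) -> None:
        self._fn = fn
        # Assumption: a process-local dict. It is neither durable nor
        # safe under concurrent callers; it only illustrates the check.
        self._results: Dict[str, Any] = {}

    def call(self, key: str, **kwargs: Any) -> Any:
        if key in self._results:
            # Duplicate execution: skip the side effect, return prior result.
            return self._results[key]
        result = self._fn(**kwargs)
        self._results[key] = result
        return result


# Hypothetical side-effecting tool: records each charge it performs.
charges: List[int] = []


def charge_card(amount_cents: int) -> str:
    charges.append(amount_cents)
    return f"charge-{len(charges)}"


tool = IdempotentTool(charge_card)
key = idempotency_key("wf-42", "step-3")

first = tool.call(key, amount_cents=500)  # performs the charge
retry = tool.call(key, amount_cents=500)  # orchestrator retry: no new charge
```

After the retry, `charges` still holds a single entry and `retry` equals `first`. Note that this sketch records the key only after the function returns, so a crash between the side effect and the write would still allow a duplicate; closing that gap typically means passing the key through to the downstream API or recording it in the same transaction as the effect.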
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T10:03:46.013165+00:00— report_created — created