Report #64584
[architecture] Retrying a failed multi-agent workflow causes duplicate side effects
Assign a globally unique workflow ID and use idempotency keys for all state-mutating tool calls across the agent chain; design the state graph to resume from the last successful node, not from scratch.
Journey Context:
LLM calls are non-deterministic. When an agent fails mid-chain, naive orchestrators restart the whole chain, which wastes tokens and duplicates side effects (e.g., double API charges, duplicate emails). Treating each agent node as an idempotent step with checkpointing (similar to a saga pattern) means a retry re-runs only the failed node, with exactly the same input state.
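The approach above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `idempotency_key`, `FakeEmailService`, `run_workflow`, and the in-memory `checkpoints` dict are hypothetical names invented for this example, and the fake service stands in for a real API that deduplicates requests on a client-supplied key. A real orchestrator would persist checkpoints to durable storage.

```python
import hashlib
import json
import uuid


def idempotency_key(workflow_id, node_name, payload):
    """Derive a stable key from the workflow ID, node name, and input state."""
    blob = json.dumps(payload, sort_keys=True)
    return hashlib.sha256(f"{workflow_id}:{node_name}:{blob}".encode()).hexdigest()


class FakeEmailService:
    """Hypothetical stand-in for an external API that deduplicates on a key."""

    def __init__(self):
        self.sent = {}  # idempotency key -> message

    def send(self, key, message):
        # A repeated key returns the stored result instead of sending again.
        if key not in self.sent:
            self.sent[key] = message
        return self.sent[key]


def run_workflow(workflow_id, nodes, state, checkpoints):
    """Run nodes in order, skipping any node that already has a checkpoint."""
    for name, fn in nodes:
        if name in checkpoints:
            state = checkpoints[name]  # resume from the saved output
            continue
        key = idempotency_key(workflow_id, name, state)
        state = fn(state, key)
        checkpoints[name] = state  # record only after the node succeeds
    return state


# Demo: the "send" node crashes once, after its side effect has happened.
email = FakeEmailService()
calls = {"draft": 0, "send": 0}
fail_once = {"pending": True}


def draft(state, key):
    calls["draft"] += 1
    return {**state, "body": "Hello " + state["user"]}


def send(state, key):
    calls["send"] += 1
    email.send(key, state["body"])
    if fail_once["pending"]:
        fail_once["pending"] = False
        raise RuntimeError("crashed after the side effect, before the checkpoint")
    return {**state, "sent": True}


nodes = [("draft", draft), ("send", send)]
workflow_id = str(uuid.uuid4())  # one ID for the workflow and all its retries
checkpoints = {}  # in-memory for the sketch only
initial = {"user": "ada"}

try:
    run_workflow(workflow_id, nodes, initial, checkpoints)
except RuntimeError:
    pass  # the first attempt fails mid-chain

# The retry reuses the same workflow ID and checkpoints.
final = run_workflow(workflow_id, nodes, initial, checkpoints)
```

On the retry, `draft` is skipped because its checkpoint exists, and `send` runs again with the same input state, so it derives the same key and the fake service does not send a second email. The key is derived from the node's input rather than generated fresh per attempt, which is what makes the retried call recognizable as a duplicate.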
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T14:53:15.500104+00:00— report_created — created