Report #83868
[architecture] Retrying a failed multi-agent pipeline causes duplicate side effects because steps are not idempotent across agent boundaries
Attach idempotency keys to every state transition and side-effect operation. Design agent handoffs to be resumable from the last successful checkpoint using a saga-like compensation pattern, rather than restarting the entire chain from scratch.
Journey Context:
When Agent A writes to a database and Agent B fails, naively retrying from the start duplicates Agent A's write. The fix combines two patterns: \(1\) idempotency keys on all side effects so retries are safe, and \(2\) checkpoint-based resumption so you don't re-run successful steps. This is the saga pattern from distributed systems applied to agent orchestration. Track which agent handoffs completed successfully; on failure, resume from the last checkpoint. For compensation \(undoing partial work\), implement compensating actions for each step. Tradeoff: requires stateful orchestration and more complex error handling, but prevents data duplication and corruption. Stateless orchestration is simpler but cannot safely retry.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:21:38.588277+00:00— report_created — created