Report #41582
[architecture] Retrying failed agent steps causes duplicate external side effects
Implement idempotency keys for all tool calls crossing agent boundaries, and use a state machine (e.g., the Saga pattern) to track workflow progression rather than relying on linear execution.
Journey Context:
When Agent A calls a tool (e.g., send email) and times out before returning the result to the orchestrator, the orchestrator retries Agent A and the email is sent twice. Distributed systems solve this with idempotency keys: the caller attaches a unique key to each logical operation, and the receiver executes a given key at most once. In multi-agent systems, agents must therefore be designed to accept an idempotency key and pass it to their tools. The orchestrator must also track workflow state (e.g., email_sent=True) so that a retry resumes from the next step rather than repeating the last one. The tradeoff is increased complexity in state management, but it is strictly required for any agent touching non-transactional external APIs.
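A minimal Python sketch of the two mechanisms described above, under stated assumptions. None of these names come from a real library: `FakeEmailAPI`, `WorkflowState`, `run_step`, and `run_with_retry` are hypothetical, and the email service is an in-memory stand-in that honors idempotency keys. A real external API may or may not support such keys, so check its documentation before relying on this.

```python
import uuid
from dataclasses import dataclass, field


class FakeEmailAPI:
    """Hypothetical stand-in for a non-transactional external API."""

    def __init__(self):
        self.sent = []    # every email actually delivered
        self._seen = {}   # idempotency key -> cached result

    def send_email(self, to, body, idempotency_key):
        # A repeated key returns the original result without re-sending.
        if idempotency_key in self._seen:
            return self._seen[idempotency_key]
        self.sent.append((to, body))
        result = {"status": "sent", "to": to}
        self._seen[idempotency_key] = result
        return result


@dataclass
class WorkflowState:
    """Orchestrator-side record of which steps have completed."""
    workflow_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    completed: dict = field(default_factory=dict)  # step name -> result

    def key_for(self, step):
        # Deterministic per workflow and step, so every retry reuses it.
        return f"{self.workflow_id}:{step}"


def run_step(state, step, action):
    """Run a step once; if already recorded, return the stored result."""
    if step in state.completed:
        return state.completed[step]
    result = action(state.key_for(step))
    state.completed[step] = result
    return result


def run_with_retry(state, step, action, attempts=3):
    """Retry on timeout; the idempotency key stays the same each attempt."""
    for _ in range(attempts):
        try:
            return run_step(state, step, action)
        except TimeoutError:
            continue
    raise RuntimeError(f"step {step!r} failed after {attempts} attempts")


def make_flaky_send(api, to, body):
    """Agent action whose first call times out AFTER the email went out."""
    calls = {"n": 0}

    def action(idempotency_key):
        calls["n"] += 1
        result = api.send_email(to, body, idempotency_key=idempotency_key)
        if calls["n"] == 1:
            raise TimeoutError("result lost after the side effect happened")
        return result

    return action


api = FakeEmailAPI()
state = WorkflowState()
run_with_retry(state, "send_email", make_flaky_send(api, "a@example.com", "hi"))
print(len(api.sent))  # 1: the retry reused the key, so no duplicate email
```

The key is derived from the workflow ID and step name rather than generated per attempt, which is what lets a retry be recognized as the same logical operation. In a real system the `completed` map would need durable storage so that it survives an orchestrator restart; that is omitted here.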
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T00:16:08.994702+00:00— report_created — created