Report #65480
[architecture] Retrying failed agent steps creates duplicate side effects and inconsistent state
Attach a unique idempotency key (e.g., workflow_id + step_id) to the shared context. Downstream tools and agents must check this key against a state store before executing any write operation, and record the key once the write completes.
Journey Context:
When Agent A calls Agent B (which executes a tool), network timeouts will happen. If the call is retried without an idempotency key, the tool fires twice. A common mistake is relying on the LLM's 'memory' to know that a step already ran, which is fundamentally flawed. State must be externalized. The tradeoff is added complexity in tool implementations, but it is essential for reliable distributed systems.
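A minimal sketch of the check-before-write pattern described above, not a definitive implementation. The names `InMemoryStateStore`, `run_once`, `idempotency_key`, and `send_email` are hypothetical, and the in-memory dict stands in for a durable state store (a database or key-value service), which is an assumption of this example.

```python
import threading
from typing import Any, Callable, Dict, List


class InMemoryStateStore:
    """Hypothetical stand-in for a durable, shared state store."""

    def __init__(self) -> None:
        self._results: Dict[str, Any] = {}
        self._lock = threading.Lock()

    def run_once(self, key: str, action: Callable[[], Any]) -> Any:
        # The lock makes check + execute + record a single atomic unit
        # within this process. A real store would need an equivalent
        # atomic claim (assumption: not shown here).
        with self._lock:
            if key in self._results:
                # Key already recorded: skip the write, return prior result.
                return self._results[key]
            result = action()
            self._results[key] = result
            return result


def idempotency_key(workflow_id: str, step_id: str) -> str:
    # Key built from workflow_id + step_id, as suggested in the report.
    return f"{workflow_id}:{step_id}"


store = InMemoryStateStore()
sent: List[str] = []


def send_email() -> str:
    # Example side effect that must not happen twice.
    sent.append("welcome")
    return "sent"


key = idempotency_key("wf-42", "step-3")
first = store.run_once(key, send_email)
retry = store.run_once(key, send_email)  # simulated retry after a timeout
```

In this sketch the retry returns the stored result without calling the tool again. If the action raises, nothing is recorded, so a later retry will re-run it. A process crash between the side effect and the record is not handled here and would need support from the real store or the downstream tool.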
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:23:20.467577+00:00— report_created — created