Report #59019
[architecture] Agent A retries a request to Agent B after a timeout, but Agent B had already processed the first request, causing duplicate side effects (double booking, double charge)
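Generate UUIDv4 idempotency keys at workflow initiation and propagate them through all agent handoffs via context headers. Each agent must persist processed keys to a durable store (Redis/DynamoDB) for at least 24 hours and skip processing if the key already exists. The existence check and the write should be one atomic set-if-absent operation, so that two concurrent retries cannot both pass the check.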
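A minimal sketch of the sending side, assuming one key per workflow and a header named `Idempotency-Key`. The header name and the helpers `start_workflow`, `handoff`, and `send_with_retry` are hypothetical, chosen for illustration and not taken from any particular agent framework. The point it shows is that a retry must reuse the original key and never mint a new one.

```python
import uuid

# Hypothetical header name; use whatever your agents agree on.
IDEMPOTENCY_HEADER = "Idempotency-Key"


def start_workflow() -> dict:
    """Create the context headers once, at workflow initiation."""
    return {IDEMPOTENCY_HEADER: str(uuid.uuid4())}


def handoff(context: dict, payload: dict) -> dict:
    """Build a message for the next agent, carrying the same key forward."""
    return {"headers": dict(context), "body": payload}


def send_with_retry(send, message: dict, attempts: int = 3):
    """Retry on timeout, resending the SAME message (and key) each time.

    `send` is any callable that delivers the message and raises
    TimeoutError when no response arrives in time.
    """
    for attempt in range(attempts):
        try:
            return send(message)
        except TimeoutError:
            if attempt == attempts - 1:
                raise  # out of attempts; surface the timeout to the caller
```

Because the key lives in the message rather than being generated inside the retry loop, Agent B sees an identical key on every attempt and can recognize the retry.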
Journey Context:
In single-agent systems, retries are comparatively safe, because the agent that retries also knows whether the work was done. In multi-agent distributed systems, a timeout or network partition between Agent A and Agent B is ambiguous, which this report calls the 'dual-write' problem: Agent A concludes that the request failed, but Agent B processed it and only the response was lost. Without idempotency keys, each retry repeats the side effect. The temptation is to use natural keys (order_id), but these don't cover all operations (e.g., 'check inventory'). UUIDv4 keys generated at the workflow start and passed through context let every agent in the chain recognize a retry. The tradeoff is storage cost and the added latency of checking the store on each request, but the check is essential for correctness.
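The receiving side can be sketched as follows. This is an illustration under stated assumptions, not a production implementation: `ProcessedKeyStore` is a hypothetical in-memory stand-in for the durable store, and `handle_request` and the `Idempotency-Key` header name are likewise invented for the example. The lock makes the check and the write a single atomic step.

```python
import threading
import time


class ProcessedKeyStore:
    """In-memory stand-in for a durable store such as Redis or DynamoDB."""

    def __init__(self, ttl_seconds: float = 24 * 60 * 60, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._expiry = {}            # key -> expiry timestamp
        self._lock = threading.Lock()

    def claim(self, key: str) -> bool:
        """Atomically record the key; return True only for the first caller."""
        now = self._clock()
        with self._lock:
            expires_at = self._expiry.get(key)
            if expires_at is not None and expires_at > now:
                return False         # key seen within the TTL: a retry
            self._expiry[key] = now + self._ttl
            return True


def handle_request(store, headers: dict, body: dict, side_effect) -> str:
    """Run side_effect at most once per idempotency key."""
    key = headers["Idempotency-Key"]  # hypothetical header name
    if not store.claim(key):
        return "duplicate_skipped"
    side_effect(body)
    return "processed"
```

With a real store, the same set-if-absent step would typically be a Redis `SET` with the `NX` and `EX` options, or a DynamoDB conditional put using `attribute_not_exists`. Note that this sketch claims the key before running the side effect, so a crash between the two steps causes the retry to be skipped; storing the result alongside the key, or claiming inside the same transaction as the side effect, is one way to close that gap.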
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T05:33:10.911110+00:00— report_created — created