Report #63708
[architecture] Duplicate side effects from retries after network timeouts or partial failures
Require an 'Idempotency-Key' header (UUIDv4) on every inter-agent request that mutates state. Downstream agents must record processed keys in a distributed store (Redis or etcd) with a TTL of at least 24 hours and return the cached response for any duplicate. For message queues, get 'exactly-once' processing by applying the idempotent consumer pattern (Kafka transactions, or SQS FIFO with its deduplication window).
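A minimal sketch of the receiving side, assuming an in-memory dictionary as a stand-in for the shared store. The names `IdempotencyCache` and `handle_mutation` are hypothetical and chosen only for illustration. The check-then-store sequence here is not atomic, so a real deployment would likely need an atomic set-if-absent in the shared store (for example Redis `SET` with the `NX` and `EX` options) to cover two copies of the same request arriving concurrently.

```python
import time
import uuid
from typing import Any, Callable, Dict, Tuple

_MISSING = object()  # sentinel so a cached response of None is still a hit


class IdempotencyCache:
    """In-memory stand-in (assumption) for a shared store such as Redis or etcd."""

    def __init__(self, ttl_seconds: float = 24 * 3600,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Any:
        entry = self._entries.get(key)
        if entry is None:
            return _MISSING
        expires_at, response = entry
        if self._clock() >= expires_at:
            # Entry outlived its TTL: forget it and treat the key as new.
            del self._entries[key]
            return _MISSING
        return response

    def put(self, key: str, response: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, response)


def handle_mutation(cache: IdempotencyCache, idempotency_key: str,
                    operation: Callable[[], Any]) -> Any:
    """Run `operation` at most once per key within the TTL; replay the
    stored response for duplicates."""
    cached = cache.get(idempotency_key)
    if cached is not _MISSING:
        return cached
    response = operation()
    cache.put(idempotency_key, response)
    return response
```

Calling `handle_mutation` twice with the same key runs the operation once and returns the stored response the second time. Once the TTL has elapsed, the same key is treated as a new request.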
Journey Context:
Simple retries assume operations are safe to repeat, but LLM agents often trigger irreversible external APIs (payments, inventory reservations). An Idempotency-Key lets the transport layer retry automatically without repeating the side effect, provided the caller reuses the same key for every attempt at a given operation. UUIDv4 keys are random, so agents can generate them without coordination and collisions are negligible in practice. A 24-hour TTL balances storage cost against typical retry windows, since network blips resolve quickly. Idempotent consumption at the message-queue layer covers the case where an agent crashes after processing a message but before acknowledging it, which would otherwise lead to redelivery and a duplicate side effect. Alternatives such as distributed locks (Redis Redlock) are sensitive to clock skew, add latency, and risk deadlock if an agent dies while holding the lock.
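The sending side can be sketched in the same spirit. The key is generated once per logical operation, before the first attempt, and reused on every retry; generating a fresh key per attempt would defeat deduplication. `TransientError`, `send_with_retries`, and the `send` callable are hypothetical names standing in for whatever transport the agents use, and backoff between attempts is omitted for brevity.

```python
import uuid
from typing import Callable, Dict, Optional


class TransientError(Exception):
    """Hypothetical error type standing in for a timeout or dropped connection."""


def send_with_retries(send: Callable[[Dict[str, str], dict], dict],
                      payload: dict, max_attempts: int = 3) -> dict:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # One key per logical operation, created before the first attempt.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    last_error: Optional[TransientError] = None
    for _ in range(max_attempts):
        try:
            # Every attempt carries the same header, so the receiver can
            # recognize a retry as a duplicate of the original request.
            return send(headers, payload)
        except TransientError as exc:
            last_error = exc
    assert last_error is not None
    raise last_error
```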
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:25:26.429117+00:00— report_created — created