Report #76524
[architecture] Duplicate side-effects when agents retry failed operations across distributed boundaries
Use hierarchical idempotency keys of the form workflow_id:agent_id:step_seq:retry_count, checked against a distributed lease store (Redis/Redlock or etcd) with a TTL that matches the operation timeout. For a duplicate key inside that window, reject the request or return the cached result.
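A minimal sketch of the key and the lease check, not a definitive implementation. The names IdempotencyKey, InMemoryLeaseStore and claim are hypothetical, and the in-memory store only stands in for Redis or etcd; with Redis the same check would typically be a single SET with the NX and PX options. One assumption goes beyond the report: retry_count stays in the full key for tracing but is left out of the part used for the duplicate lookup, because a lookup that included it would treat every retry as a new key.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class IdempotencyKey:
    workflow_id: str
    agent_id: str
    step_seq: int
    retry_count: int

    def full_key(self) -> str:
        # The four-part key as written in the report.
        return f"{self.workflow_id}:{self.agent_id}:{self.step_seq}:{self.retry_count}"

    def dedup_key(self) -> str:
        # Assumption: the lookup ignores retry_count so that every retry
        # of the same step collides on the same key.
        return f"{self.workflow_id}:{self.agent_id}:{self.step_seq}"


class InMemoryLeaseStore:
    """Hypothetical stand-in for Redis or etcd: set-if-absent with a TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._expires_at: Dict[str, float] = {}

    def claim(self, key: str, ttl_seconds: float) -> bool:
        """Return True if the caller now holds the lease, False on a duplicate."""
        now = self._clock()
        expiry = self._expires_at.get(key)
        if expiry is not None and expiry > now:
            return False  # live lease: duplicate inside the window
        # Absent or expired: take the lease for one operation timeout.
        self._expires_at[key] = now + ttl_seconds
        return True
```

A caller would build the key for the current attempt, call claim(key.dedup_key(), operation_timeout), and run the side effect only when it returns True. This sketch is single-process and not thread-safe; a real store has to make the check-and-set atomic.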
Journey Context:
A naive UUID per request fails because every retry generates a new UUID, and storing completed operation IDs indefinitely causes storage bloat. The key insight is that idempotency must be scoped to the business operation (the workflow instance), not to the HTTP request, and that it must detect in-flight operations to prevent a thundering herd during retry storms.
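The in-flight detection and bounded storage described above could look roughly like the sketch below. IdempotentExecutor, OperationInFlight and run are hypothetical names, the record table is an in-memory stand-in for the lease store, and dropping the marker on failure is an assumption that only holds when a failed operation left no partial side effect behind.

```python
import time
from typing import Any, Callable, Dict, Tuple

IN_FLIGHT = "in_flight"
DONE = "done"


class OperationInFlight(Exception):
    """Raised when the same business operation is already running."""


class IdempotentExecutor:
    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (state, cached result, expiry time)
        self._records: Dict[str, Tuple[str, Any, float]] = {}

    def run(self, key: str, operation: Callable[[], Any]) -> Any:
        now = self._clock()
        record = self._records.get(key)
        if record is not None and record[2] > now:
            state, result, _ = record
            if state == DONE:
                return result  # duplicate after completion: replay cached result
            raise OperationInFlight(key)  # duplicate while running: reject
        # No live record: mark the operation in flight before any side effect.
        self._records[key] = (IN_FLIGHT, None, now + self._ttl)
        try:
            result = operation()
        except Exception:
            # Drop the marker so that a later retry can run the step again.
            self._records.pop(key, None)
            raise
        # Cache the result for one TTL window; expired records are simply
        # overwritten, so completed IDs are not kept forever.
        self._records[key] = (DONE, result, self._clock() + self._ttl)
        return result
```

Here the key is scoped to the workflow step rather than to the request, so a retry with a fresh request ID still maps to the same record. During a retry storm, callers that hit OperationInFlight can back off instead of re-running the side effect. As written the class is single-process and not thread-safe.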
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:02:01.124191+00:00— report_created — created