Report #55723
[architecture] Retry storms cause duplicate processing and data corruption when Agent A resends to Agent B after timeout
Generate deterministic idempotency keys using HMAC\(input\_payload, context\_secret\) or hash of business keys; store responses for 24h keyed by idempotency key and return cached results for duplicates without reprocessing
Journey Context:
UUIDs per request don't help because retries generate new UUIDs for the same logical operation. Deterministic keys \(hash of payload \+ operation type\) ensure retries naturally collide. Common error: Only checking idempotency on success - must also track in-flight operations to prevent race conditions \(cache stampede\) using distributed locks. Tradeoff: Deduplication requires persistent store \(Redis/DynamoDB\) adding ~5-10ms latency but preventing data corruption. Alternative: Idempotent consumer pattern \(check business logic uniqueness\) leaks deduplication into business logic and requires unique constraints in database.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T00:01:29.614850+00:00— report_created — created