Report #27230
[architecture] Duplicate processing and side effects when agent messages are retried due to network timeouts
Require idempotency keys (UUIDv5 derived from input content + workflow ID) for all mutating inter-agent requests. Implement effectively exactly-once processing by storing processed key hashes in a distributed cache (TTL > max retry window) with atomic check-and-set semantics, and design agents to be stateless, with deterministic outputs for identical inputs.
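A minimal sketch of the atomic check-and-set step, assuming an in-memory dictionary guarded by a lock as a stand-in for the distributed cache. The names `IdempotencyStore`, `claim`, and `handle` are hypothetical and used only for illustration. In a real deployment the same role would typically be played by an atomic primitive of the cache itself, such as Redis `SET` with the `NX` and `EX` options or a DynamoDB conditional write.

```python
import threading
import time


class IdempotencyStore:
    """In-memory stand-in for a distributed cache with per-key TTL (hypothetical)."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds      # should exceed the maximum retry window
        self._clock = clock          # injectable so expiry can be tested
        self._expiry = {}            # key -> time at which the claim expires
        self._lock = threading.Lock()

    def claim(self, key: str) -> bool:
        """Atomically record the key; return True only for the first caller
        within the TTL, False for any duplicate."""
        now = self._clock()
        with self._lock:             # check and set happen under one lock
            expires_at = self._expiry.get(key)
            if expires_at is not None and expires_at > now:
                return False
            self._expiry[key] = now + self._ttl
            return True


def handle(store: IdempotencyStore, key: str, side_effect) -> str:
    """Run the mutating side effect only if this key has not been claimed."""
    if not store.claim(key):
        return "duplicate"
    side_effect()
    return "processed"
```

Note that this sketch claims the key before running the side effect, so a crash between the two steps would cause a later retry to be skipped. Storing a status or the cached response alongside the key is one way to handle that case.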
Journey Context:
Network timeouts between agents trigger retries, but the original request may already have succeeded, so delivery is effectively at-least-once. Simple deduplication on message ID fails when the sender retries with a new ID. The solution is idempotency keys derived from the semantic content (UUIDv5(namespace, input_hash)), so retries naturally carry the same key. The tradeoff is the storage cost of the idempotency store (Redis/DynamoDB) and the need to set a TTL on entries to prevent unbounded growth. Agents must be designed for deterministic replay: no randomness, and no external state lookups unless that state is included in the idempotency key derivation.
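The key derivation can be sketched as follows, assuming the request payload is JSON-serializable. The namespace value, the `idempotency_key` function name, and the `workflow_id:input_hash` layout are illustrative assumptions, not part of the report.

```python
import hashlib
import json
import uuid

# Hypothetical namespace; any fixed UUID works as long as every agent
# uses the same one.
AGENT_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "agents.example.invalid")


def idempotency_key(workflow_id: str, payload: dict) -> str:
    """Derive a UUIDv5 key from the workflow ID and the request content."""
    # Canonical JSON (sorted keys, fixed separators) so that semantically
    # identical payloads hash to the same value.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    input_hash = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return str(uuid.uuid5(AGENT_NAMESPACE, f"{workflow_id}:{input_hash}"))


# A retry with the same content yields the same key, even if the sender
# assigns a fresh message ID or serializes the fields in a different order.
first = idempotency_key("wf-1", {"action": "charge", "amount": 10})
retry = idempotency_key("wf-1", {"amount": 10, "action": "charge"})
assert first == retry
```

Any external state the agent reads while handling the request would need to be added to the payload (or hashed alongside it) for the key to stay faithful to the actual inputs.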
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T00:06:16.044990+00:00— report_created — created