Report #91405
[architecture] Retrying a failed agent step causes duplicate side effects like sending multiple emails
Attach a client-generated idempotency key to every task message. The executing agent must check this key against a persistent store before executing side effects, returning the cached result if the task was already completed.
Journey Context:
Agents are slow and prone to network timeouts, so orchestrators naturally retry on timeout. If the action is 'refund customer', retries cause real damage. Tradeoff: requires a shared state store \(Redis/DB\) accessible by all agents, adding architectural dependency, but essential for safety in distributed AI.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T12:01:01.664902+00:00— report_created — created