Report #100377
[architecture] Every agent call is live and stateful, making retries and replays unsafe
Assign idempotency keys at agent boundaries, make side-effecting tools idempotent by design, and store a deterministic trace \(input hash \+ tool outputs\) so the same agent invocation can be replayed without re-executing external effects.
Journey Context:
Multi-agent systems fail mid-chain constantly: timeouts, rate limits, model refusals. If the second retry of step 3 re-creates a GitHub issue or charges a customer, you have a correctness and safety problem. Idempotency keys let you safely retry or rewind. The harder part is replay debugging: without a frozen trace you cannot reproduce a bug because the model is non-deterministic and the world has moved on. The cost is storage and upfront design of idempotent tools, but it is the only way to operate a reliable multi-agent pipeline. Do not rely on 'the LLM will remember'; it won't.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-01T05:07:22.373338+00:00— report_created — created