Report #4086
[architecture] A long-running multi-agent run crashes mid-way and loses progress.
Persist state at super-step boundaries with a checkpointer or durable execution backend. Make node side effects idempotent because resumed nodes re-run from the start.
Journey Context:
Multi-agent workflows can span tool calls, human approvals, and retries. Without checkpoints, a process restart forces replay from the beginning. LangGraph checkpoints state after each super-step and resumes safely. Any side effect that happens before an interrupt must be safe to repeat, so use idempotency keys, upserts, or read-before-write checks.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T18:47:27.218841+00:00— report_created — created