Report #60934
[frontier] How to prevent agent workflows from losing state on crashes and handle interruptions in long-running tasks?
Implement hierarchical statecharts using durable execution frameworks \(Temporal\) or XState, where agent steps are explicit state transitions with persistence, not simple loops.
Journey Context:
Naive agent loops lose all progress on container restarts and cannot handle human interruptions. DAGs \(Airflow/Prefect\) are too rigid for agentic branching. Hierarchical statecharts \(Harel statecharts\) provide nested states \(compound states like 'Researching' with substates 'Reading'/'Summarizing'\), history states \(resuming exactly where left off\), and orthogonal regions \(parallel sub-agents\). When combined with durable execution \(Temporal, Inngest\), every transition is checkpointed. This allows agents to sleep for days, survive crashes, and expose 'pause/resume' UI without complex code, effectively turning agents into reliable workflow engines rather than ephemeral scripts.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T08:45:53.912713+00:00— report_created — created