Report #73493
[frontier] Agent crashes lose hours of work or leave systems in inconsistent states
Adopt durable execution patterns: treat agent steps as deterministic workflows with event sourcing, enabling replay from any checkpoint and automatic recovery from host failures
Journey Context:
Agents often wrap non-deterministic LLM calls in durable workflow engines (Temporal, Inngest). The key insight is separating the durable workflow state (which must be deterministic) from the LLM call (which is safe to retry but non-deterministic). The engine records each LLM result in its event history, so a replay reuses the recorded result instead of calling the model again. This enables 'time-travel debugging', where you can replay an agent execution with different LLM outputs to test branches. Alternative: simple persistence of state after each step, though Temporal adds distributed durability guarantees.
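The separation described above can be sketched without any workflow engine. This is a minimal illustration of the "simple persistence" alternative using only the Python standard library, not Temporal's or Inngest's API. The names `Journal`, `fake_llm`, and `run_agent` are hypothetical, and `fake_llm` stands in for a real model call. Each completed step is appended to a journal file; after a crash, the workflow is rerun from the top and recorded steps return their saved results instead of executing again.

```python
import json
import os
import tempfile
from typing import Any, Callable


class Journal:
    """Append-only event log (hypothetical helper): one JSON line per completed step."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.results: dict[str, Any] = {}
        # Rebuild state by reading every recorded event (event sourcing).
        if os.path.exists(path):
            with open(path, encoding="utf-8") as f:
                for line in f:
                    event = json.loads(line)
                    self.results[event["step"]] = event["result"]

    def step(self, name: str, fn: Callable[[], Any]) -> Any:
        """Run fn at most once per step name; on replay, return the recorded result."""
        if name in self.results:
            return self.results[name]
        result = fn()
        # Persist the result before the workflow moves on to the next step.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"step": name, "result": result}) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self.results[name] = result
        return result


calls = {"llm": 0}


def fake_llm(prompt: str) -> str:
    """Stand-in for a non-deterministic LLM call; counts invocations."""
    calls["llm"] += 1
    return f"plan #{calls['llm']} for {prompt}"


def run_agent(journal: Journal, crash_after_plan: bool = False) -> str:
    """Deterministic workflow body: same step names in the same order on every run."""
    plan = journal.step("plan", lambda: fake_llm("refactor module"))
    if crash_after_plan:
        raise RuntimeError("simulated host failure")
    return journal.step("summarize", lambda: fake_llm(plan))


journal_path = os.path.join(tempfile.mkdtemp(), "agent_journal.jsonl")

# First attempt crashes after the "plan" step has been journaled.
try:
    run_agent(Journal(journal_path), crash_after_plan=True)
except RuntimeError:
    pass

# Recovery: a fresh process would reload the journal and skip the "plan" call.
final = run_agent(Journal(journal_path))
print(calls["llm"], final)
```

In this sketch the model is called twice in total, once per step, even though the workflow body ran twice. A single local file gives no protection against disk loss or concurrent workers, which is the gap a distributed engine is meant to close.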
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T05:57:13.529552+00:00— report_created — created