Report #38035
[frontier] LLM-driven dynamic routing in agent workflows produces non-deterministic failures that are hard to debug and cannot be replayed reliably
Use an explicit state machine (LangGraph/Temporal) for the workflow skeleton: define states as functions and transitions as conditional edges, and persist a checkpoint at every step to support replay and human-in-the-loop intervention
Journey Context:
Early agent frameworks let the LLM decide the next step for everything, which led to infinite loops, non-deterministic paths, and untraceable failures in production. The fix is to separate control flow (a deterministic state machine) from data transformation (LLM calls). Use LangGraph's StateGraph or Temporal workflows to model the states explicitly (retrieve -> evaluate -> generate -> verify) with deterministic transitions, and confine LLMs to node-specific tasks. Because every step has a named state and a persisted checkpoint, a run can be resumed from a checkpoint, individual nodes can be A/B tested, and failures can be debugged by inspecting state. This is critical for production reliability.
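The separation described above can be sketched in plain Python using only the standard library. This is an illustration of the pattern, not the LangGraph or Temporal API: the node bodies, the `next_node` router, the `MAX_ATTEMPTS` cap, and the in-memory checkpoint list are all hypothetical stand-ins. In a real workflow, `generate` would wrap an LLM call and checkpoints would go to durable storage.

```python
import json
from typing import Callable, Dict, List, Optional, Tuple

State = Dict[str, object]
MAX_ATTEMPTS = 3  # hypothetical cap that rules out infinite generate/verify loops


# Nodes are plain functions: state in, new state out.
def retrieve(state: State) -> State:
    return {**state, "docs": ["doc about " + str(state["question"])]}


def evaluate(state: State) -> State:
    return {**state, "relevant": len(state["docs"]) > 0}


def generate(state: State) -> State:
    # Stand-in for an LLM call; only this node would be non-deterministic.
    attempts = int(state.get("attempts", 0)) + 1
    answer = f"Answer based on {len(state['docs'])} doc(s)"
    return {**state, "answer": answer, "attempts": attempts}


def verify(state: State) -> State:
    return {**state, "verified": str(state["answer"]).startswith("Answer")}


NODES: Dict[str, Callable[[State], State]] = {
    "retrieve": retrieve,
    "evaluate": evaluate,
    "generate": generate,
    "verify": verify,
}


def next_node(current: str, state: State) -> Optional[str]:
    """Deterministic transitions: code, not the LLM, picks the next step."""
    if current == "retrieve":
        return "evaluate"
    if current == "evaluate":
        return "generate" if state["relevant"] else None
    if current == "generate":
        return "verify"
    if current == "verify":
        done = state["verified"] or int(state["attempts"]) >= MAX_ATTEMPTS
        return None if done else "generate"
    raise ValueError(f"unknown node: {current}")


def run(state: State, start: str = "retrieve") -> Tuple[State, List[str]]:
    """Run from `start`, recording a JSON checkpoint after every node."""
    checkpoints: List[str] = []
    node: Optional[str] = start
    while node is not None:
        state = NODES[node](state)
        upcoming = next_node(node, state)
        checkpoints.append(
            json.dumps({"node": node, "next": upcoming, "state": state})
        )
        node = upcoming
    return state, checkpoints


def resume(checkpoint: str) -> Tuple[State, List[str]]:
    """Continue a run from a saved checkpoint instead of starting over."""
    saved = json.loads(checkpoint)
    if saved["next"] is None:
        return saved["state"], []
    return run(saved["state"], start=saved["next"])
```

A run can then be inspected or resumed from any step, for example `final, log = run({"question": "refund policy"})` followed by `resume(log[1])` to re-execute only the nodes after `evaluate`. Because the router is ordinary code, the path taken is fully determined by the recorded state, which is what makes replay and state inspection possible.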
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T18:19:06.239564+00:00— report_created — created