Report #83502
[frontier] LLM agents make invalid state transitions (e.g., sending email before verification) despite prompt instructions
Use PydanticAI with structured output validation: define agent states as Literal types and check each transition in a Pydantic field validator, so that an invalid state change raises a validation error and triggers a retry.
Journey Context:
Controlling agent flow via prompt engineering ('You must call verify before send') is brittle; LLMs hallucinate invalid transitions when context grows or models are swapped. Hardcoding state machines in Python (if state == 'pending': ...) forces developers to micro-manage control flow, losing the LLM's reasoning flexibility. The frontier pattern (PydanticAI, late 2024) treats the LLM response as a state transition function that is validated at runtime and annotated with types a static checker can read. Agent outputs are Pydantic models where fields like 'next_state' are Literals (e.g., Literal['researching', 'writing']). Field validators enforce transition rules (e.g., 'reviewing' can only follow 'writing'); to do so the validator needs the current state, for example as another field on the same model. If the LLM outputs an invalid transition, Pydantic raises a ValidationError, and as long as side effects run only on validated output, nothing has happened yet. The error triggers an automatic retry with feedback ('Invalid transition: cannot go from reviewing to researching'). This replaces fragile 'prompt chasing' with type-checked agent graphs. Unlike LangGraph's explicit node/edge definitions, this uses Python's type system as the state machine definition, which makes it easier to refactor and lets IDEs autocomplete and check state names.
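A minimal sketch of the validation step described above, using plain Pydantic v2 rather than PydanticAI's own agent API. The transition table `ALLOWED`, the `Transition` model, the `next_transition` helper, the `propose` callable, and the extra 'done' state are all hypothetical names introduced for illustration; the hand-written retry loop only stands in for whatever retry mechanism the agent framework provides.

```python
from typing import Callable, Literal, Optional

from pydantic import BaseModel, ValidationError, ValidationInfo, field_validator

State = Literal["researching", "writing", "reviewing", "done"]

# Hypothetical transition table: each state maps to the states allowed to follow it.
ALLOWED: dict[str, set[str]] = {
    "researching": {"writing"},
    "writing": {"reviewing"},
    "reviewing": {"writing", "done"},
    "done": set(),
}


class Transition(BaseModel):
    # current_state is declared first so it is already validated
    # (and present in info.data) when next_state is checked.
    current_state: State
    next_state: State

    @field_validator("next_state")
    @classmethod
    def check_transition(cls, value: str, info: ValidationInfo) -> str:
        current = info.data.get("current_state")
        if current is not None and value not in ALLOWED[current]:
            raise ValueError(
                f"Invalid transition: cannot go from {current} to {value}"
            )
        return value


def next_transition(
    current: str,
    propose: Callable[[str, Optional[str]], str],
    max_retries: int = 2,
) -> Transition:
    """Ask `propose` (a stand-in for the LLM call) for the next state,
    feeding the validation message back on each failed attempt."""
    feedback: Optional[str] = None
    for _ in range(max_retries + 1):
        raw = propose(current, feedback)
        try:
            # Side effects should only run after this line succeeds.
            return Transition(current_state=current, next_state=raw)
        except ValidationError as exc:
            feedback = exc.errors()[0]["msg"]
    raise RuntimeError(f"No valid transition after retries: {feedback}")
```

A caller would pass a function that wraps the model request as `propose` and perform the real action (sending the email, publishing the draft) only on the returned `Transition`. Keeping the rules in one table means adding a state touches the `State` Literal and `ALLOWED`, not the prompt.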
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:44:41.902174+00:00— report_created — created