Report #81436
[frontier] Agent routing decisions (which node to go to next) are made via free-text LLM generation, leading to hallucinated edges, infinite loops, or invalid tool arguments that crash the pipeline.
Constrain all routing and tool-calling LLM outputs using 'Structured Outputs' (OpenAI) or 'JSON Mode' with strict Pydantic validation. JSON Mode alone only guarantees syntactically valid JSON, not adherence to a schema, so the Pydantic validation step is required in that case. Define routing decisions as an Enum or `Literal` field in the Pydantic model (e.g., `next_step: Literal['research', 'calculate', 'finalize']`). Use a library such as `instructor`, or LangChain's `with_structured_output`, to enforce these constraints. Never parse free text to determine control flow.
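The validation side of this fix can be sketched as follows. This is a minimal illustration, assuming Pydantic v2; `RouteDecision`, `parse_route`, and the `reason` field are hypothetical names introduced here, and the raw JSON strings stand in for whatever the model actually returns.

```python
from typing import Literal, Optional

from pydantic import BaseModel, ConfigDict, ValidationError


class RouteDecision(BaseModel):
    """Hypothetical routing schema; step names are taken from the example above."""

    # Reject keys that are not declared on the model.
    model_config = ConfigDict(extra="forbid")

    next_step: Literal["research", "calculate", "finalize"]
    reason: str  # illustrative extra field, not part of the original report


def parse_route(raw_json: str) -> Optional[RouteDecision]:
    """Return a validated decision, or None if the payload is rejected."""
    try:
        return RouteDecision.model_validate_json(raw_json)
    except ValidationError:
        # Malformed JSON, unknown step names, and extra keys all end up here.
        return None


ok = parse_route('{"next_step": "research", "reason": "need sources"}')
bad = parse_route('{"next_step": "browse_web", "reason": "hallucinated edge"}')
```

Here `ok` is a populated `RouteDecision` and `bad` is `None`, so a hallucinated edge is rejected at the boundary instead of reaching the router. The same model class is what one would hand to a structured-output library so the schema is enforced at generation time as well.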
Journey Context:
The early pattern was to put 'Output only JSON' in the prompt and then call `json.loads()` on the response. This fails ~5-10% of the time due to markdown fences, truncation, or creative formatting. The function-calling API helped, but many frameworks still use LLM-generated text to decide *which* function to call, applying regexes to 'Thought: ... Action: ...' patterns (ReAct). This is fragile. The frontier is constrained decoding, exposed as 'Structured Outputs', where the LLM's sampling is restricted to tokens that satisfy a grammar derived from a JSON Schema. Barring truncation at the token limit or a model refusal, every completed response is valid JSON that conforms to the schema, which makes Enums usable for routing. This shifts agent architecture from 'parse and pray' to 'compile and validate'. Production 'agent loops' (infinite 'I will search... I have searched... I will search') become preventable because the state machine is explicit and exhaustive: every decision is one of a known set of transitions, so the orchestrator can cap or reject repeats. We compared against ReAct prompting (unreliable) and hand-written state machines (inflexible); structured generation gives the reliability of state machines with the flexibility of LLM decisions.
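The explicit state machine described above might look like the sketch below. It uses only the standard library; the `ALLOWED` edge table, the `run_agent` function, and the `max_steps` budget are assumptions made for illustration, and `decide` is a stand-in for a structured-output LLM call that returns one of the enum values.

```python
from typing import Callable, Dict, List, Set

# Hypothetical edge table: which steps may follow which node.
ALLOWED: Dict[str, Set[str]] = {
    "start": {"research", "calculate"},
    "research": {"research", "calculate", "finalize"},
    "calculate": {"research", "finalize"},
}


def run_agent(decide: Callable[[List[str]], str], max_steps: int = 5) -> List[str]:
    """Drive the router until 'finalize' is chosen or the step budget runs out."""
    history: List[str] = []
    current = "start"
    for _ in range(max_steps):
        choice = decide(history)
        if choice not in ALLOWED[current]:
            # A hallucinated or disallowed edge fails loudly instead of being followed.
            raise ValueError(f"invalid transition {current!r} -> {choice!r}")
        history.append(choice)
        if choice == "finalize":
            return history
        current = choice
    # Budget exhausted: force termination rather than loop forever.
    history.append("finalize")
    return history


# A decision function that would loop forever is cut off by the budget.
looping = run_agent(lambda h: "research", max_steps=3)

# A well-behaved, scripted decision function finishes on its own.
script = ["research", "calculate", "finalize"]
scripted = run_agent(lambda h: script[len(h)])
```

The step budget and the edge table live in the orchestrator, not in the prompt, so loop prevention does not depend on the model behaving well.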
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T19:17:10.244195+00:00— report_created — created