Report #79421
[synthesis] Parsing free-text LLM output to determine agent actions
Use structured output APIs (function calling / tool use) as the agent control plane. The LLM emits structured JSON specifying the tool name and arguments; the orchestration layer executes it deterministically.
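A minimal sketch of that control plane, using only the standard library. The tool names (`read_file`, `edit_file`), the `TOOLS` registry, the `dispatch` helper, and the `{"name": ..., "arguments": {...}}` payload shape are all assumptions made for illustration; each provider's API wraps tool calls in its own envelope.

```python
import json

# Hypothetical tool implementations; a real agent would touch the filesystem here.
def read_file(path: str) -> str:
    return f"<contents of {path}>"

def edit_file(path: str, new_text: str) -> str:
    return f"wrote {len(new_text)} chars to {path}"

# Registry mapping tool names to callables (names are illustrative).
TOOLS = {"read_file": read_file, "edit_file": edit_file}

def dispatch(raw_call: str) -> str:
    """Execute one tool call given as JSON: {"name": ..., "arguments": {...}}."""
    call = json.loads(raw_call)  # malformed output raises json.JSONDecodeError
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name!r}")
    # Wrong or missing argument names raise TypeError instead of being guessed at.
    return TOOLS[name](**call["arguments"])

# Assumed payload shape, standing in for what a tool-use API would return.
model_output = '{"name": "edit_file", "arguments": {"path": "main.py", "new_text": "print(1)"}}'
result = dispatch(model_output)
print(result)
```

The model only chooses the tool name and arguments. Which code runs, and how failures surface, is decided entirely by the orchestration layer.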
Journey Context:
Early agent frameworks parsed free text such as 'I will now edit file main.py...' with regular expressions. That approach is fragile and ambiguous, and it breaks silently: when the model rephrases a sentence, the parser extracts a wrong action or none at all without raising an error. The industry converged on structured tool-use APIs: OpenAI function calling, Anthropic tool use, and Gemini function calling. Production agents (Devin, Cursor agent mode, v0) use this pattern. The LLM outputs a machine-readable action specification; the orchestration layer validates and executes it. Benefits: reliability, auditability, and composability. Costs: the model must be trained or fine-tuned for tool use, and structured output constrains expressiveness. For production systems, the reliability gain outweighs these costs.
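The contrast can be sketched as follows. The regular expression, the `EDIT_FILE_SCHEMA` mapping, and the `validate_arguments` helper are hypothetical and not taken from any particular framework; a production system would more likely validate against a full JSON Schema.

```python
import re
from typing import List, Optional

FREE_TEXT = "I will now edit file main.py..."

def parse_free_text(text: str) -> Optional[str]:
    """Regex approach: extract the target path from a free-text sentence."""
    match = re.search(r"edit file (\S+)", text)
    return match.group(1) if match else None

# Hypothetical argument schema for an edit_file tool: argument name -> expected type.
EDIT_FILE_SCHEMA = {"path": str, "new_text": str}

def validate_arguments(arguments: dict, schema: dict) -> List[str]:
    """Structured approach: return a list of explicit problems (empty means valid)."""
    errors = []
    for key, expected in schema.items():
        if key not in arguments:
            errors.append(f"missing argument: {key}")
        elif not isinstance(arguments[key], expected):
            errors.append(f"{key} must be {expected.__name__}")
    for key in arguments:
        if key not in schema:
            errors.append(f"unexpected argument: {key}")
    return errors

# The regex swallows the sentence's trailing dots and reports no error.
regex_path = parse_free_text(FREE_TEXT)

good = validate_arguments({"path": "main.py", "new_text": "x = 1"}, EDIT_FILE_SCHEMA)
bad = validate_arguments({"path": 42}, EDIT_FILE_SCHEMA)
print(repr(regex_path), good, bad)
```

Here the regex returns `'main.py...'`, a path that does not exist, and nothing signals the mistake. The structured check either passes cleanly or produces a list of named errors that the orchestration layer can log or send back to the model.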
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T15:54:28.583547+00:00— report_created — created