Report #47653
[synthesis] Agents use free-form text output and parse it with regex or fragile string extraction, causing silent failures when the model deviates from expected format
Use structured output (JSON schema, tool-use function calling, or constrained decoding) as the default output mode for every agent step. Define explicit schemas for every agent action: tool invocations, plan steps, retrieval queries, and intermediate reasoning. Reserve free-form text only for the final user-facing response.
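A minimal sketch of what "explicit schemas for every agent action" could look like, using only the Python standard library. The action names (`ToolCall`, `PlanStep`, `RetrievalQuery`), the `type` discriminator field, and the `parse_action` helper are hypothetical, chosen for illustration; a real pipeline would more likely rely on the provider's schema enforcement or a validation library.

```python
import json
from dataclasses import dataclass, fields


# Hypothetical action schemas: one dataclass per machine-consumed agent output.
@dataclass(frozen=True)
class ToolCall:
    tool: str
    arguments: dict


@dataclass(frozen=True)
class PlanStep:
    index: int
    description: str


@dataclass(frozen=True)
class RetrievalQuery:
    query: str
    top_k: int


# The "type" field of the payload selects which schema applies (an assumption).
ACTION_SCHEMAS = {
    "tool_call": ToolCall,
    "plan_step": PlanStep,
    "retrieval_query": RetrievalQuery,
}


class SchemaError(ValueError):
    """Raised when model output does not match any declared schema."""


def parse_action(raw: str):
    """Parse one model output into a typed action, or raise SchemaError."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise SchemaError(f"not valid JSON: {exc}") from exc
    if not isinstance(payload, dict):
        raise SchemaError("top-level value must be a JSON object")

    kind = payload.pop("type", None)
    schema = ACTION_SCHEMAS.get(kind) if isinstance(kind, str) else None
    if schema is None:
        raise SchemaError(f"unknown action type: {kind!r}")

    # Require exactly the declared fields, each with the declared type.
    expected = {f.name: f.type for f in fields(schema)}
    if set(payload) != set(expected):
        raise SchemaError(
            f"{kind}: expected fields {sorted(expected)}, got {sorted(payload)}"
        )
    for name, field_type in expected.items():
        if not isinstance(payload[name], field_type):
            raise SchemaError(f"{kind}.{name}: expected {field_type.__name__}")
    return schema(**payload)
```

With this shape, a prose reply or a payload with a missing field raises `SchemaError` at the parsing boundary, where it can be logged or fed back to the model for a retry, instead of propagating a half-parsed action downstream.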
Journey Context:
Early agent frameworks (AutoGPT, initial LangChain chains) used free-form text output parsed with regex or string matching. This is fragile: models frequently deviate from the expected format under edge cases, long contexts, or adversarial inputs. The convergence across successful products tells a clear story: OpenAI added structured outputs and JSON mode as first-class API features; Anthropic's tool use returns structured JSON by design, with explicit schema definitions; v0 generates code in a structured component schema rather than raw text; Cursor's agent mode uses structured tool calls for every action (file read, file edit, terminal command, search). The synthesis: structured output is not an API convenience but a fundamental reliability technique. It constrains the model's output space, making outputs parseable by construction rather than by convention. The tradeoff: structured output can reduce the model's flexibility in edge cases (the model works harder to fit the schema, occasionally producing awkward formulations). But for agent pipelines where downstream steps depend on parsing the output, this constraint is a feature. The rule: structured for all machine-consumed output, free-form only for human-consumed final responses.
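The silent-failure mode described above can be illustrated with a small contrast. This is a sketch under stated assumptions: the `Action: <tool>[<argument>]` text convention, the `tool`/`argument` JSON fields, and both function names are hypothetical and not taken from any specific framework.

```python
import json
import re

# Hypothetical free-form convention: "Action: <tool>[<argument>]".
ACTION_RE = re.compile(r"Action: (\w+)\[(.*?)\]")


def parse_free_form(text):
    """Parse by convention: returns None when the pattern does not match."""
    match = ACTION_RE.search(text)
    if match is None:
        # The caller cannot tell "no action requested" from "format drifted".
        return None
    return match.group(1), match.group(2)


def parse_structured(text):
    """Parse a JSON payload: any deviation raises ValueError."""
    payload = json.loads(text)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(payload, dict):
        raise ValueError("top-level value must be a JSON object")
    tool = payload.get("tool")
    argument = payload.get("argument")
    if not isinstance(tool, str) or not isinstance(argument, str):
        raise ValueError("expected string fields 'tool' and 'argument'")
    return tool, argument


expected = "Action: search[structured output]"
drifted = "Action: search(structured output)"  # brackets became parentheses

print(parse_free_form(expected))  # ('search', 'structured output')
print(parse_free_form(drifted))   # None: the action is silently dropped

try:
    parse_structured(drifted)
except ValueError as exc:
    print(f"rejected: {exc}")     # the deviation surfaces as an error
```

Validation after the fact only makes failures loud. Schema-constrained decoding or strict tool calling on the provider side goes further by preventing the malformed output from being generated in the first place.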
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T10:27:50.254440+00:00— report_created — created