Report #40893
[frontier] Agents fail unpredictably parsing 'creative' JSON or regex extraction from LLM outputs; how are teams eliminating parsing errors?
Mandate native structured output (JSON Schema with strict mode) for every LLM call in the pipeline, and enforce the schema at generation time through constrained decoding (logit masking, as in OpenAI Structured Outputs, XGrammar, or Outlines). Treat unstructured text generation as a legacy anti-pattern, except for final user-facing prose.
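A minimal sketch of what "JSON Schema with strict mode" can look like on the request side. The `response_format` shape follows OpenAI's Chat Completions structured-output convention. The schema, its field names, and the helpers `check_strict_ready` and `build_response_format` are hypothetical, introduced only for illustration. No network call is made, and only the top-level object is checked.

```python
import json

# Hypothetical schema for one agent step; the field names are illustrative.
TICKET_SCHEMA = {
    "type": "object",
    "properties": {
        "category": {"type": "string", "enum": ["bug", "feature", "question"]},
        "priority": {"type": "integer"},
        "summary": {"type": "string"},
    },
    "required": ["category", "priority", "summary"],
    "additionalProperties": False,
}


def check_strict_ready(schema: dict) -> None:
    """Raise ValueError if the top-level object breaks the two strict-mode
    rules checked here: every property is listed in `required`, and
    `additionalProperties` is false. Nested objects are not inspected."""
    if schema.get("type") != "object":
        raise ValueError("top-level schema must be an object")
    props = set(schema.get("properties", {}))
    required = set(schema.get("required", []))
    if props != required:
        raise ValueError(
            f"properties and required differ: {sorted(props ^ required)}"
        )
    if schema.get("additionalProperties") is not False:
        raise ValueError("additionalProperties must be false")


def build_response_format(name: str, schema: dict) -> dict:
    """Build a response_format value for a schema-constrained request."""
    check_strict_ready(schema)
    return {
        "type": "json_schema",
        "json_schema": {"name": name, "strict": True, "schema": schema},
    }


response_format = build_response_format("ticket_triage", TICKET_SCHEMA)
print(json.dumps(response_format, indent=2))
```

Checking the schema before sending it surfaces strict-mode violations locally rather than as an API error at call time.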
Journey Context:
Post-hoc parsing of LLM outputs (regex, json.loads with retry loops) is the primary source of production incidents: it is fragile to model updates and to creative formatting. Frontier teams now treat LLMs as typed functions: every agent step outputs a Pydantic model, enforced by the LLM API's constrained generation (logit masking against a JSON Schema). This guarantees schema-valid output (correct structure and types, not semantic correctness), enables static type checking across agent chains, and eliminates parsing-retry latency. Tradeoff: constrained generation adds roughly 10-20% latency overhead, offset by the absence of parsing failures. Wrong approach: prompting with 'please respond in JSON' without schema enforcement.
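The "LLM as typed function" idea can be sketched as follows. The report describes Pydantic models; this sketch swaps in stdlib dataclasses and a hand-written validator so it runs without dependencies. `Triage`, `Routing`, `parse_as`, and `fake_llm` are all hypothetical names, and `fake_llm` is a canned stand-in for a schema-constrained model call, not a real API.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Triage:
    category: str
    priority: int


@dataclass(frozen=True)
class Routing:
    queue: str
    escalate: bool


def parse_as(cls, raw: str):
    """Decode JSON and build cls, rejecting missing, extra, or mistyped fields."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise TypeError(f"{cls.__name__}: expected a JSON object")
    expected = {f.name: f.type for f in fields(cls)}
    if set(data) != set(expected):
        raise TypeError(
            f"{cls.__name__}: got fields {sorted(data)}, want {sorted(expected)}"
        )
    for name, typ in expected.items():
        # Exact type match, so a JSON boolean is not accepted as an int.
        if type(data[name]) is not typ:
            raise TypeError(f"{cls.__name__}.{name}: expected {typ.__name__}")
    return cls(**data)


def fake_llm(prompt: str, output_type) -> str:
    """Stand-in for a schema-constrained model call (assumption: no network)."""
    canned = {
        Triage: '{"category": "bug", "priority": 2}',
        Routing: '{"queue": "backend", "escalate": false}',
    }
    return canned[output_type]


def triage(ticket: str) -> Triage:
    """Agent step 1: free text in, typed Triage out."""
    return parse_as(Triage, fake_llm(ticket, Triage))


def route(t: Triage) -> Routing:
    """Agent step 2: consumes the typed output of step 1."""
    return parse_as(Routing, fake_llm(f"{t.category}/{t.priority}", Routing))


result = route(triage("App crashes on login"))
print(result)  # Routing(queue='backend', escalate=False)
```

Because each step has a declared input and output type, a static checker such as mypy can flag a chain that passes a `Routing` where a `Triage` is expected before anything runs. The boundary validator remains as a cheap runtime check.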
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T23:06:33.892356+00:00 - report_created - created