Report #71680
[frontier] Structured output generation failing or hanging on complex nested JSON schemas
Implement schema pre-validation for structured-output constrained decoding: before calling the LLM with a JSON schema constraint (OpenAI structured outputs, Ollama grammar, llama.cpp GBNF), check the schema for constructs that make grammar-constrained decoding expensive. Reject schemas containing nested 'anyOf', nested 'oneOf', or recursive references, which can cause exponential backtracking in CFG-based constrained decoders. Flatten such unions to a single object with optional fields, or use an 'enum' field for type discrimination. If complex unions are genuinely required, decompose the task into multiple LLM calls with a router pattern (one call to classify/route, subsequent calls with simplified schemas) rather than sending a single monolithic schema.
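The pre-validation step described above can be sketched as a small schema walker. This is a minimal illustration, not a definitive implementation: the names `find_schema_risks` and `resolve_local_ref` are hypothetical, only local `$ref` pointers (those starting with `#`) are followed, and "nested" is assumed to mean a union keyword appearing anywhere beneath another union keyword.

```python
from typing import Any

UNION_KEYWORDS = ("anyOf", "oneOf")
# Keys whose values map property names to subschemas (the names are not keywords).
SCHEMA_MAPS = ("properties", "patternProperties")
# Skipped on the direct walk: definitions are visited only through $ref,
# and literal values are not schemas.
SKIPPED_KEYS = ("$defs", "definitions", "enum", "const", "default", "examples")


def resolve_local_ref(root: dict, ref: str) -> Any:
    """Resolve a local pointer such as '#/$defs/Node' (no '~' escapes handled)."""
    node: Any = root
    for part in ref.lstrip("#").strip("/").split("/"):
        if part:
            node = node[part]
    return node


def find_schema_risks(root: dict) -> list[str]:
    """Return findings for nested unions and recursive local references."""
    findings: list[str] = []

    def walk(node: Any, path: str, in_union: bool, ref_stack: tuple) -> None:
        if isinstance(node, list):
            for index, item in enumerate(node):
                walk(item, f"{path}/{index}", in_union, ref_stack)
            return
        if not isinstance(node, dict):
            return
        ref = node.get("$ref")
        if isinstance(ref, str) and ref.startswith("#"):
            if ref in ref_stack:
                # The target is already being expanded, so the schema is recursive.
                findings.append(f"{path}: recursive $ref {ref}")
            else:
                walk(resolve_local_ref(root, ref), ref, in_union, ref_stack + (ref,))
            return
        for key, value in node.items():
            if key in SKIPPED_KEYS:
                continue
            child = f"{path}/{key}"
            if key in SCHEMA_MAPS and isinstance(value, dict):
                for name, subschema in value.items():
                    walk(subschema, f"{child}/{name}", in_union, ref_stack)
            elif key in UNION_KEYWORDS:
                if in_union:
                    findings.append(f"{child}: nested {key}")
                walk(value, child, True, ref_stack)
            else:
                walk(value, child, in_union, ref_stack)

    # The root pointer starts on the stack so that {"$ref": "#"} counts as recursive.
    walk(root, "#", False, ("#",))
    return list(dict.fromkeys(findings))  # de-duplicate, keep first-seen order
```

A caller would refuse to send the schema (or fall back to a flattened variant) whenever the returned list is non-empty. Whether a given backend actually struggles with a flagged construct is backend-specific and should be checked against that backend.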
Journey Context:
Developers hit 'infinite generation' or timeout errors when using OpenAI's structured outputs with deeply nested 'anyOf' unions (e.g., 'ToolResult = Success | Error | Pending'). The CFG (context-free grammar) constrained decoder backtracks while trying to satisfy branches that cannot be completed, so the model 'gets stuck' emitting partial tokens indefinitely. The problem is particularly acute in agent frameworks, where tool outputs have complex polymorphic shapes. The fix rests on recognizing that structured outputs are not 'JSON mode' (post-hoc validation); they are grammar-constrained generation, where schema complexity directly affects latency and success rate. Shifting complexity from the schema to the orchestration layer (router pattern) trades a single LLM call for several cheaper, more predictable calls. This matters for production agents, where 'stuck generation' breaks latency SLAs and causes cascading timeouts.
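The router pattern for the 'ToolResult' example can be sketched as two constrained calls, each with a flat schema. Everything here is an assumption made for illustration: the variant field names, the prompts, and the `call_llm` callable, which stands in for whatever client function takes a prompt and a JSON schema and returns raw JSON text.

```python
import json
from typing import Callable

# Hypothetical flat schemas, one per variant of ToolResult = Success | Error | Pending.
VARIANT_SCHEMAS = {
    "success": {
        "type": "object",
        "properties": {"output": {"type": "string"}},
        "required": ["output"],
        "additionalProperties": False,
    },
    "error": {
        "type": "object",
        "properties": {"message": {"type": "string"}},
        "required": ["message"],
        "additionalProperties": False,
    },
    "pending": {
        "type": "object",
        "properties": {"retry_after_seconds": {"type": "integer"}},
        "required": ["retry_after_seconds"],
        "additionalProperties": False,
    },
}

# Call 1 only has to pick a variant: a single enum field, no unions.
ROUTER_SCHEMA = {
    "type": "object",
    "properties": {"kind": {"type": "string", "enum": list(VARIANT_SCHEMAS)}},
    "required": ["kind"],
    "additionalProperties": False,
}

# Assumed client signature: (prompt, json_schema) -> raw JSON text.
LLMCall = Callable[[str, dict], str]


def route_and_extract(text: str, call_llm: LLMCall) -> dict:
    """Classify first, then extract with the schema of the chosen variant only."""
    routed = json.loads(call_llm(f"Classify this tool result:\n{text}", ROUTER_SCHEMA))
    kind = routed["kind"]
    if kind not in VARIANT_SCHEMAS:
        raise ValueError(f"router returned unknown kind: {kind!r}")
    payload = json.loads(
        call_llm(f"Extract the {kind} fields from:\n{text}", VARIANT_SCHEMAS[kind])
    )
    return {"kind": kind, **payload}
```

Because `call_llm` is injected, the orchestration logic can be exercised with a stub before any real backend is wired in. The cost of the design is one extra round trip per result, in exchange for schemas that contain no unions at all.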
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T02:53:43.261551+00:00— report_created — created