Report #35480
[gotcha] Streaming JSON or structured output from AI creates unparseable partial content, defeating the purpose of streaming
For structured output endpoints, buffer the complete response before parsing and rendering. If streaming UX is required, use a two-phase approach: stream a natural language summary first, then deliver the structured data. Alternatively, design schemas with the most important fields first so early partial JSON parsing yields useful data.
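A minimal sketch of the buffer-then-parse recommendation, assuming the response arrives as plain text chunks. The `BufferedJson` class and its `push`/`finish` method names are hypothetical and not part of any SDK; only `JSON.parse` is a real API here.

```typescript
// Hypothetical helper: accumulates streamed text and parses it exactly once.
class BufferedJson<T> {
  private buffer = "";

  // Append one streamed chunk. Returns the number of characters received so
  // far, which can drive a loading indicator without exposing partial JSON.
  push(chunk: string): number {
    this.buffer += chunk;
    return this.buffer.length;
  }

  // Call only after the stream has ended. Throws a SyntaxError if the
  // accumulated text is not complete, valid JSON.
  finish(): T {
    return JSON.parse(this.buffer) as T;
  }
}

// Usage with simulated chunks (a real stream would supply these over time).
const pending = new BufferedJson<{ title: string; tags: string[] }>();
for (const chunk of ['{"title":"Hi",', '"tags":["a","b"]}']) {
  pending.push(chunk);
}
const result = pending.finish();
console.log(result.title, result.tags.length);
```

In a real client the loop would typically be a `for await` over whatever chunk stream the SDK exposes, with the UI showing a loading state until `finish()` returns.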
Journey Context:
Streaming's value is incremental rendering: users see content appear and can start reading immediately. With structured output (JSON mode, function calling, tool use), however, the text received so far is syntactically invalid JSON until the final closing bracket arrives, so calling JSON.parse() on an individual chunk, or on the buffer accumulated so far, throws. This creates the worst of both worlds: you cannot render partial results (they are unparseable), and you cannot show a proper loading state either (the stream has 'started' and tokens are arriving). Naively parsing partial JSON fails, and even best-effort parsers are fragile with nested structures. OpenAI's structured output and function calling features have this fundamental tension with streaming. Solutions: (1) Do not stream structured output; buffer and parse when complete, showing a loading state. (2) Two-phase generation: first generate a streaming natural language response, then generate the structured data as a follow-up. (3) Design JSON schemas where the first keys contain the most important data, so that partial parsing yields something useful. Option 1 is the simplest and most reliable.
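As an illustration of option 3, the sketch below recovers only the top-level fields that have fully arrived from a partial JSON object. The function name `parseCompletedFields` is hypothetical, and the approach (cut at the last top-level comma, then close the object) is one simple best-effort technique, not a general partial-JSON parser. It assumes the top-level value is an object and that the important keys come first in the schema.

```typescript
// Hypothetical best-effort helper: returns the top-level fields that are
// already complete in a partial JSON object, or null if none are available.
function parseCompletedFields(partial: string): Record<string, unknown> | null {
  // Fast path: the text may already be complete, valid JSON.
  try {
    const whole: unknown = JSON.parse(partial);
    if (typeof whole === "object" && whole !== null && !Array.isArray(whole)) {
      return whole as Record<string, unknown>;
    }
    return null;
  } catch (e) {
    // Incomplete text: fall through to the scan below.
  }

  let depth = 0;
  let inString = false;
  let escaped = false;
  let lastTopLevelComma = -1;

  for (let i = 0; i < partial.length; i++) {
    const ch = partial[i];
    if (inString) {
      // Inside a string, only track escapes and the closing quote.
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{" || ch === "[") depth++;
    else if (ch === "}" || ch === "]") depth--;
    // A comma directly inside the outer object follows a complete field.
    else if (ch === "," && depth === 1) lastTopLevelComma = i;
  }

  if (lastTopLevelComma === -1) return null;
  try {
    // Drop everything after the last complete field and close the object.
    return JSON.parse(partial.slice(0, lastTopLevelComma) + "}") as Record<string, unknown>;
  } catch (e) {
    return null;
  }
}

// The important "status" field is usable before the nested "items" array ends.
const early = parseCompletedFields('{"status":"ok","items":[{"id":1,"na');
console.log(early);
```

Fields nested inside an unfinished array or object are not recovered at all, which is why this only pays off when the schema puts the most important keys first.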
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T14:01:04.402256+00:00— report_created — created