Report #68358
[gotcha] Streaming structured output creates invalid intermediate JSON that crashes frontend parsers
Use a streaming-aware JSON parser (e.g., partial-json) that tolerates incomplete structures, or buffer structured outputs server-side and emit them only on completion. Note that json5 relaxes JSON syntax (comments, trailing commas) but still rejects truncated input, so it is not a substitute for a partial parser. For OpenAI structured outputs, check the refusal field before rendering, and never call JSON.parse on partial chunks.
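A minimal sketch of the buffer-then-parse option, assuming chunks arrive as plain strings and that the caller passes along any refusal text it received. The names `BufferedJsonCollector` and `ParseResult` are hypothetical and not taken from any SDK.

```typescript
// Hypothetical result type for illustration; not part of any library.
type ParseResult<T> =
  | { kind: "ok"; value: T }
  | { kind: "refusal"; message: string }
  | { kind: "invalid"; error: string };

// Accumulates streamed text chunks and parses only once the stream is done.
class BufferedJsonCollector<T> {
  private parts: string[] = [];

  // Store a chunk without attempting to parse it.
  push(chunk: string): void {
    this.parts.push(chunk);
  }

  // `refusal` stands in for whatever refusal field the API returned, if any.
  finish(refusal: string | null = null): ParseResult<T> {
    if (refusal !== null) {
      return { kind: "refusal", message: refusal };
    }
    try {
      // The only JSON.parse call happens here, on the complete text.
      return { kind: "ok", value: JSON.parse(this.parts.join("")) as T };
    } catch (err) {
      // A truncated stream ends up here instead of crashing the renderer.
      const error = err instanceof Error ? err.message : String(err);
      return { kind: "invalid", error };
    }
  }
}
```

Returning a tagged result rather than throwing lets the frontend render a refusal or a truncation message explicitly instead of failing inside a parser.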
Journey Context:
Developers naturally want to stream every response for perceived speed, but structured outputs (JSON, XML) are only valid once complete. A standard JSON parser throws on a partial token stream. The temptation is to wrap JSON.parse in a try/catch on each chunk, which keeps failing until the final chunk arrives and gives the UI nothing to render in the meantime. The tradeoff: buffering adds latency but guarantees valid output, while streaming requires a specialized parser. For small payloads, buffer and send the complete result. For large payloads, use a streaming-aware parser or switch to line-delimited JSON (NDJSON), which can be parsed incrementally, one line at a time. The gotcha is that streaming works fine in testing with short outputs but fails unpredictably in production when responses are longer or truncated.
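The NDJSON option can be sketched as follows, assuming the server emits one complete JSON value per line and that chunk boundaries may fall anywhere, including mid-line. `NdjsonReader` is a hypothetical helper name, not a library API.

```typescript
// Incremental NDJSON reader: each complete line is one JSON value.
class NdjsonReader {
  private remainder = "";

  // Feed one chunk; returns the values from every line completed by it.
  feed(chunk: string): unknown[] {
    this.remainder += chunk;
    const lines = this.remainder.split("\n");
    // The last element is an unfinished line ("" if the chunk ended on "\n").
    this.remainder = lines.pop() ?? "";
    const values: unknown[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      // Skip blank lines; parse only lines known to be complete.
      if (trimmed !== "") values.push(JSON.parse(trimmed));
    }
    return values;
  }

  // Call at end of stream. Parses a final line that lacks a trailing
  // newline; JSON.parse throws here if that line was truncated.
  end(): unknown[] {
    const trimmed = this.remainder.trim();
    this.remainder = "";
    return trimmed === "" ? [] : [JSON.parse(trimmed)];
  }
}
```

Because only newline-terminated lines are parsed, a chunk that ends mid-object is held back rather than handed to JSON.parse, and truncation surfaces once, at `end()`, where the caller can handle it deliberately.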
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T21:13:33.276142+00:00— report_created — created