Report #30621
[gotcha] Streaming AI responses with structured output (JSON) produces unparseable intermediate chunks
Use an incremental/partial JSON parsing library (e.g., partial-json-parser for Python, partial-json or best-effort-json-parser for JS) to parse streaming JSON as it arrives, or buffer the full response before parsing. Never attempt to JSON.parse() each streaming chunk independently.
Journey Context:
When you enable structured output and stream the response, each SSE chunk carries only a fragment of the JSON object. Naively parsing each chunk fails because a JSON fragment is not valid JSON on its own. Teams often discover this only in production when switching from non-streaming to streaming endpoints; the non-streaming path works because the full JSON arrives at once. The tradeoff: buffering until the stream completes defeats the UX benefit of streaming (users see nothing until the entire response is ready), so incremental parsing is preferred for real-time UIs. Libraries like partial-json handle incomplete JSON by heuristically closing open brackets and strings, giving you a best-effort parseable object at any point in the stream. Treat intermediate values as provisional, since a string or number may still be cut off mid-value.
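To make the "close open brackets and strings" idea concrete, here is a minimal sketch using only the Python standard library. It is not the implementation or API of partial-json or any other library; the function names `complete_partial_json` and `best_effort_snapshots` and the sample chunks are hypothetical, chosen for illustration. It accumulates chunks in a buffer, repairs the buffer on each step, and keeps the last good snapshot whenever the repair is not enough (for example, a literal cut off as `tr`).

```python
import json
from typing import Any, Iterable, Iterator


def complete_partial_json(fragment: str) -> str:
    """Close any string, array, or object left open in `fragment` (heuristic)."""
    closers = []        # closing brackets still owed, innermost last
    in_string = False
    escaped = False     # True right after a backslash inside a string
    for ch in fragment:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]" and closers:
            closers.pop()

    repaired = fragment
    if in_string:
        if escaped:
            repaired = repaired[:-1]    # drop a dangling backslash
        repaired += '"'                 # terminate the open string
    else:
        repaired = repaired.rstrip()
        if repaired.endswith(","):
            repaired = repaired[:-1]    # a trailing comma is invalid JSON
    return repaired + "".join(reversed(closers))


def best_effort_snapshots(chunks: Iterable[str]) -> Iterator[Any]:
    """Yield the latest parseable view of the stream after each chunk."""
    buffer = ""
    snapshot = None
    for chunk in chunks:
        buffer += chunk                 # always parse the accumulated buffer
        try:
            snapshot = json.loads(complete_partial_json(buffer))
        except json.JSONDecodeError:
            pass                        # repair was not enough; keep last good value
        yield snapshot


if __name__ == "__main__":
    # Hypothetical chunks, split at arbitrary points as a stream might deliver them.
    demo = ['{"title": "Str', 'eaming", "tags": ["a", ', '"b"], "done": tr', 'ue}']
    for view in best_effort_snapshots(demo):
        print(view)
```

The point of the design is that parsing always runs on the accumulated buffer, never on an individual chunk, and that a failed repair leaves the UI showing the previous snapshot instead of raising. A real partial-JSON library covers more cases (half-written keys, truncated numbers and literals), so this sketch is a way to understand the technique, not a replacement for one.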
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T05:47:02.424360+00:00— report_created — created