Report #56593
[gotcha] Structured output mode (JSON schema) eliminates streaming UX: UI hangs, then dumps the entire response
Use partial JSON parsing to render completed fields incrementally as they stream in: scan the stream character by character and emit each key-value pair as soon as its value closes. Alternatively, use a hybrid schema in which a text summary field comes first and streams for immediate feedback, followed by the structured data fields. For simple cases, buffer the streamed output, show a typing indicator during generation, and parse the complete JSON once the stream finishes.
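The hybrid-schema option can be sketched as follows. This is a minimal illustration, not a vetted implementation: the function name `partial_string_field` and the field name `summary` are hypothetical, and it assumes the summary is the first field in the schema, so the first occurrence of the quoted key in the buffer is the key itself and not text inside another value.

```python
import json
from typing import Optional


def partial_string_field(buffer: str, key: str) -> Optional[str]:
    """Return the text received so far for the string field `key`.

    Returns None until the field's opening quote has arrived.
    Assumption: `key` is the first field, so the first match is the key.
    """
    marker = json.dumps(key)  # the key as it appears in JSON, e.g. "summary"
    start = buffer.find(marker)
    if start == -1:
        return None
    i = start + len(marker)
    # Skip the colon and any whitespace between key and value.
    while i < len(buffer) and buffer[i] in " \t\r\n:":
        i += 1
    if i >= len(buffer) or buffer[i] != '"':
        return None
    i += 1
    raw = []
    while i < len(buffer):
        ch = buffer[i]
        if ch == '"':
            break  # closing quote: the value is complete
        if ch == "\\":
            # Hold back an escape sequence until all of it has arrived.
            if i + 1 >= len(buffer):
                break
            width = 6 if buffer[i + 1] == "u" else 2
            if i + width > len(buffer):
                break
            raw.append(buffer[i:i + width])
            i += width
            continue
        raw.append(ch)
        i += 1
    # Let the json module decode the escapes in the safe prefix.
    return json.loads('"' + "".join(raw) + '"', strict=False)


# Usage: re-run on the growing buffer and render the result each time.
buffer = ""
for chunk in ['{"summ', 'ary": "Hel', 'lo \\"wor', 'ld", "items": [1, 2]}']:
    buffer += chunk
    text = partial_string_field(buffer, "summary")
    if text is not None:
        print(text)
```

Re-scanning the whole buffer on every chunk keeps the sketch short; for long responses a caller would likely want to remember the scan position instead.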
Journey Context:
When you switch from text streaming to JSON/structured output, the streaming UX regresses to a loading spinner, because a partial JSON document is invalid and a standard parser rejects it, so nothing can be rendered. Users accustomed to seeing tokens appear now see nothing until the entire response finishes; this feels like a performance regression even when total latency is identical. The fix with partial JSON parsing is non-trivial: you must track brace/bracket depth, handle escaped characters, and emit fields only when their values are complete. Libraries like partial-json handle this. The tradeoff is increased client-side parsing complexity versus a maintained streaming UX. Teams often discover this only after shipping structured output and receiving complaints that the chat feels slower.
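The bookkeeping described above (depth tracking, escape handling, emitting only complete values) can be sketched as a small hand-rolled scanner. This is an illustrative sketch, not the partial-json library's API: the class name `TopLevelFieldStream` is hypothetical, and it assumes the response is a single JSON object whose top-level fields are the units worth rendering.

```python
import json
from typing import Any, Iterator, List, Tuple


class TopLevelFieldStream:
    """Emit (key, value) pairs of a streamed JSON object as each value closes.

    Assumption: the top-level value is an object, not an array or scalar.
    """

    def __init__(self) -> None:
        self.depth = 0            # current brace/bracket nesting depth
        self.in_string = False    # inside a JSON string literal?
        self.escaped = False      # previous character was a backslash?
        self.member: List[str] = []  # raw text of the current top-level member

    def feed(self, chunk: str) -> Iterator[Tuple[str, Any]]:
        for ch in chunk:
            if self.in_string:
                self.member.append(ch)
                if self.escaped:
                    self.escaped = False      # escaped char never ends the string
                elif ch == "\\":
                    self.escaped = True
                elif ch == '"':
                    self.in_string = False
                continue
            if ch == '"':
                self.in_string = True
                self.member.append(ch)
            elif ch in "{[":
                self.depth += 1
                if self.depth > 1:            # keep nested openers, drop the outer {
                    self.member.append(ch)
            elif ch in "}]":
                self.depth -= 1
                if self.depth >= 1:
                    self.member.append(ch)
                else:
                    yield from self._flush()  # outer } closes the last member
            elif ch == "," and self.depth == 1:
                yield from self._flush()      # comma between top-level members
            else:
                self.member.append(ch)

    def _flush(self) -> Iterator[Tuple[str, Any]]:
        text = "".join(self.member).strip()
        self.member.clear()
        if text:
            # Wrap the member in braces so the json module parses it.
            yield from json.loads("{" + text + "}").items()


# Usage: feed chunks as they arrive and render each completed field.
stream = TopLevelFieldStream()
for chunk in ['{"title": "a, b}"', ', "tags": ["x", ', '"y"], "ok": tr', 'ue}']:
    for key, value in stream.feed(chunk):
        print(key, value)
```

Note that `feed` is a generator, so it does nothing until iterated. A field is emitted only when the comma or closing brace after it arrives, which is what keeps a truncated number or literal such as `tr` from being rendered early.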
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T01:28:54.458495+00:00— report_created — created