Report #60630
[gotcha] Streaming breaks when you need structured JSON output from the AI
Use a streaming-capable incremental JSON parser (e.g., partial-json, json-repair) that can handle incomplete JSON, or use function/tool calling, which returns complete structured output. Alternatively, stream a text response for UX and parse to JSON only after the stream completes, accepting the latency tradeoff.
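To illustrate the incremental-parsing option, here is a minimal standard-library sketch of the idea behind such parsers: close whatever strings and brackets are still open in the buffer, then try a normal parse. It is not the API of partial-json or json-repair; `parse_partial_json` is a hypothetical helper written for this example, and real libraries handle many more edge cases.

```python
import json
from typing import Any, Optional


def parse_partial_json(buffer: str) -> Optional[Any]:
    """Best-effort parse of a JSON prefix (hypothetical helper).

    Returns the parsed value if the prefix can be completed by closing
    open strings/brackets, otherwise None.
    """
    closers = []        # closing brackets still owed, outermost first
    in_string = False
    escaped = False     # True right after a backslash inside a string

    for ch in buffer:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]" and closers:
            closers.pop()

    candidate = buffer
    if in_string:
        if escaped:
            candidate = candidate[:-1]   # drop a dangling backslash
        candidate += '"'                 # close the unfinished string
    candidate += "".join(reversed(closers))  # close innermost bracket first

    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        # e.g. the buffer ends mid-key or right after a colon or comma
        return None


if __name__ == "__main__":
    # Simulated stream chunks (made-up data for illustration).
    chunks = ['{"title": "Str', 'eaming", "tags": ["js', 'on", "ux"]}']
    buffer, latest = "", None
    for chunk in chunks:
        buffer += chunk
        snapshot = parse_partial_json(buffer)
        if snapshot is not None:
            latest = snapshot            # keep the last usable snapshot
            print(latest)
```

Snapshots are provisional: a string or number may still grow in a later chunk (`4` can become `42`), so treat intermediate values as display-only and trust only the final parse.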
Journey Context:
You design your UX around streaming for responsiveness. Then you need structured output (JSON) for downstream processing. Partial JSON is syntactically invalid: a standard parser rejects a half-finished JSON object. This is a fundamental mismatch, because streaming optimizes for incremental display while structured output requires completeness. OpenAI's structured outputs with response_format can still be streamed, but every intermediate chunk is an incomplete document, so nothing downstream can consume it until the last token arrives, which defeats the purpose. The gotcha: teams discover this late in development, after building their entire streaming UX, and then face a painful refactor. Function calling returns well-formed arguments, but they are only usable once the call completes, so it gives up incremental display for the structured part. The hybrid approach (stream text for display, parse to JSON post-completion) works but adds complexity and latency to the structured path.
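The mismatch and the hybrid approach can be sketched in a few lines of standard-library Python. This is an illustration under assumptions, not a definitive implementation: `fake_stream` is a hypothetical stand-in for a model's streamed chunks, and `stream_then_parse` and `is_parseable` are names invented for this example.

```python
import json
from typing import Any, Callable, Iterable, Iterator


def fake_stream() -> Iterator[str]:
    """Hypothetical stand-in for chunks arriving from a streaming API."""
    yield from ['{"sentiment": "pos', 'itive", "confi', 'dence": 0.93}']


def is_parseable(text: str) -> bool:
    """True only if text is a complete, valid JSON document."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def stream_then_parse(
    chunks: Iterable[str],
    on_chunk: Callable[[str], None],
) -> Any:
    """Forward each chunk for display, then parse once the stream ends."""
    parts = []
    for chunk in chunks:
        on_chunk(chunk)        # display path: incremental
        parts.append(chunk)
    # Structured path: only possible after the final chunk has arrived.
    return json.loads("".join(parts))


if __name__ == "__main__":
    # A prefix of the response is rejected by a standard parser.
    print(is_parseable('{"sentiment": "pos'))

    result = stream_then_parse(fake_stream(), on_chunk=lambda c: print(c, end=""))
    print()
    print(result["sentiment"])
```

The structured result is available only after the loop finishes, which is the latency cost described above. In a real integration the final `json.loads` would also need error handling for malformed model output.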
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T08:15:26.037227+00:00— report_created — created