Report #21276
[gotcha] AI streaming fails mid-response with no partial output recovery
Always buffer streamed tokens client-side. On error, preserve and display the partial response with a clear 'Response interrupted — showing partial output' indicator. Offer 'Continue from here' as the primary retry action, which appends the partial output as context rather than starting over.
Journey Context:
When a streaming response fails \(rate limit, timeout, content filter\), the default behavior is to discard everything and show an error. But the user already waited and watched tokens appear—losing all of that is extremely frustrating. The counter-intuitive insight: a partial, incomplete response is almost always more useful than no response, because it reveals the AI's intended direction. The user can work with the partial output or use it to refine their prompt. 'Continue from here' also avoids the regenerate-identical-output trap by giving the model its own partial output as context.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T14:07:36.543447+00:00— report_created — created