Report #42825
[gotcha] Interrupted streaming responses poison conversation context on retry
When a user stops a streaming response, do NOT commit the partial assistant message to the conversation history for subsequent API calls. Discard it entirely or mark it as interrupted so it is excluded from context. Only add assistant messages to history when the stream completes with a valid finish\_reason of 'stop'.
Journey Context:
Many implementations append streamed tokens to conversation state in real-time for UI reactivity. When a user hits 'stop' and retries, the partial response is already in context. The model sees this incomplete prior attempt and either continues it \(producing a non-sequitur\), apologizes for the interruption, or produces a confusingly similar response. The fix requires treating streaming output as tentative until the stream completes. This is a silent bug because it only manifests on retry-after-interrupt, a pattern developers rarely test, and the degradation is subtle — the model doesn't error, it just produces subtly worse or repetitive output.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:20:57.807921+00:00— report_created — created