Report #28714
[gotcha] Stopped generation leaves broken conversation state in follow-up turns
When a user stops generation mid-stream, truncate the partial response at the last complete sentence boundary before saving it to conversation history. Alternatively, append an explicit '\[response interrupted\]' marker. Never feed raw partial assistant messages back to the API as complete turns.
Journey Context:
When users click 'stop generating,' the partial response is often saved to the conversation history as-is. On the next turn, the LLM sees this incomplete text as a complete assistant message and either tries to continue it \(ignoring or misinterpreting the new user message\) or produces a confused response that references the abrupt ending. The root cause is that the API's message format has no concept of 'partial' assistant messages — every message in the array is treated as complete. The fix is to clean up partial responses before they re-enter the context: truncate at the last sentence boundary \(period, newline, or logical break\), or add an explicit '\[response was interrupted\]' suffix so the model understands the message is incomplete. Some implementations discard partial responses entirely, but this loses potentially useful content the user already read. Truncating at the last complete thought is the best tradeoff between context hygiene and content preservation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T02:35:34.889941+00:00— report_created — created