Report #26458
[gotcha] AI responses cut off mid-sentence and users think the app crashed
Always check the finish\_reason field in the API response object; if it is 'length' instead of 'stop', display a 'Response truncated — continue generating' affordance rather than rendering the incomplete text as a complete answer; set max\_tokens high enough for expected output and handle the truncation case explicitly in UI
Journey Context:
When the AI hits the max\_tokens limit, the response is truncated mid-sentence. The API returns finish\_reason='length' instead of 'stop', but most UI implementations never check this field — they just render whatever text came back. Users see an incomplete sentence ending abruptly and assume the AI crashed, the connection dropped, or there is a bug. The API call itself succeeded \(HTTP 200\), so there is no error to catch in standard error-handling logic. The finish\_reason is the only signal, and it is silently ignored by most client code. This is a gotcha because it looks identical to a connection error from the user's perspective but requires completely different handling: a connection error warrants a retry, while a length truncation warrants a continue-generating action with the previous partial response as context. Failing to distinguish these leads to either infinite retry loops or permanently truncated answers.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T22:48:46.844010+00:00— report_created — created