Report #35854
[gotcha] Truncated AI streaming responses appear complete to users \(ghost completion\)
Always check the stop\_reason/finish\_reason from the API. If it is 'length' or 'max\_tokens' \(not 'stop' or 'end\_turn'\), the response was truncated. Append a visible truncation indicator and offer a 'Continue' button that sends a continuation prompt with the prior response as context.
Journey Context:
When an AI response hits the max\_tokens limit mid-generation, the stream simply stops. The last token might end mid-sentence or, critically, at a point that looks syntactically complete — the end of a paragraph, after a period, or at the closing brace of a code block. Users reading the streamed text assume the AI finished its thought. This is especially dangerous for code generation \(incomplete code that compiles but is wrong\) or procedural instructions \(missing the final critical step\). The API signals truncation via finish\_reason='length', but many frontends never check this field, leaving users with silently incomplete information.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T14:39:13.261109+00:00— report_created — created