Report #49201
[gotcha] AI response appears complete but is silently truncated — users act on incomplete information
Always check finish_reason in the API response object. If finish_reason is 'length' (not 'stop'), render a visible truncation indicator and provide a 'continue generation' affordance. Never silently present a max_tokens-truncated response as complete.
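A minimal sketch of that check, assuming an OpenAI-style chat completion payload where the stop cause lives at choices[0].finish_reason (other providers may name the field differently). The render_reply helper and the keys of the view dict it returns are hypothetical names used for illustration only.

```python
from typing import Any


def render_reply(response: dict[str, Any]) -> dict[str, Any]:
    """Build a display model that records whether the reply was cut off."""
    # Assumed payload shape: {"choices": [{"message": {...}, "finish_reason": ...}]}
    choice = response["choices"][0]
    text = choice["message"]["content"] or ""
    finish_reason = choice.get("finish_reason")

    # 'stop' means the model ended on its own; 'length' means max_tokens cut it off.
    truncated = finish_reason == "length"
    return {
        "text": text,
        "truncated": truncated,
        # Hypothetical UI hints: a visible banner plus a 'continue generation' button.
        "banner": "Response cut off at the token limit." if truncated else None,
        "show_continue_button": truncated,
    }


# Hand-written sample payloads; no network call is made.
truncated_view = render_reply({
    "choices": [{
        "message": {"content": "The main risk is low."},
        "finish_reason": "length",
    }]
})
complete_view = render_reply({
    "choices": [{
        "message": {"content": "The main risk is low."},
        "finish_reason": "stop",
    }]
})
```

Both sample replies carry identical text that ends at a period; only finish_reason tells them apart, which is why checking for non-empty content is not enough.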
Journey Context:
When max_tokens is hit, the model stops mid-thought. The response can look syntactically complete (ending at a period or paragraph break) while being logically incomplete. Users then act on half-baked analysis, missing caveats that would have appeared later. This is especially dangerous with streaming: each token displays immediately, and there is no natural 'done' moment distinguishing a normal stop from a forced one. Many implementations only check for the presence of response content, not why generation stopped. The finish_reason field exists precisely for this, but it is routinely ignored in client code.
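For the streaming case, one possible approach is to keep the last non-null finish_reason seen while tokens are painted, then classify the stream once it ends. The sketch below assumes OpenAI-style stream chunks (choices[0].delta.content for text, and finish_reason set to null until the final chunk); consume_stream and stream_status are hypothetical helper names, and the chunks are simulated rather than read from a live API.

```python
from typing import Any, Iterable, Optional


def consume_stream(chunks: Iterable[dict[str, Any]]) -> tuple[str, Optional[str]]:
    """Accumulate streamed text and remember why generation stopped."""
    parts: list[str] = []
    finish_reason: Optional[str] = None
    for chunk in chunks:
        choice = chunk["choices"][0]
        piece = choice.get("delta", {}).get("content")
        if piece:
            parts.append(piece)  # a real UI would paint this token immediately
        # Assumed: finish_reason is None on every chunk except the last one.
        if choice.get("finish_reason") is not None:
            finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason


def stream_status(finish_reason: Optional[str]) -> str:
    """Map the stop cause to an explicit end-of-stream state for the UI."""
    if finish_reason == "stop":
        return "complete"
    if finish_reason == "length":
        return "truncated"
    # Missing or unrecognized reason (for example a dropped connection):
    # do not present the text as complete.
    return "unknown"


# Simulated chunks: the text ends at a period, yet the stream was cut by max_tokens.
simulated = [
    {"choices": [{"delta": {"content": "Risk "}, "finish_reason": None}]},
    {"choices": [{"delta": {"content": "is low."}, "finish_reason": None}]},
    {"choices": [{"delta": {}, "finish_reason": "length"}]},
]
streamed_text, reason = consume_stream(simulated)
status = stream_status(reason)
```

Treating a missing finish_reason as "unknown" rather than "complete" is a design choice in this sketch: it gives the stream an explicit 'done' moment and keeps an interrupted stream from being shown as a finished answer.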
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:04:13.394137+00:00— report_created — created