Report #79006
[gotcha] AI response appears complete but was silently truncated at max\_tokens
Always check finish\_reason in the API response. If finish\_reason is 'length', the response was cut off — surface a 'continue generating' affordance or auto-continue with a follow-up prompt. Never render a truncated response as if it were complete.
Journey Context:
The most insidious UX failure in AI products: LLMs generate fluent text right up to the cutoff, so a truncated response looks coherent and complete. Users read a confident, well-formed partial answer and assume it is the full answer — there is no visual signal that anything is missing. This is strictly worse than an error because the user has no idea something went wrong. The finish\_reason field exists precisely to signal this, but most UI implementations never check it. Users then make decisions on incomplete analysis, half-written code, or partial instructions. The fix is not just checking the field — it is designing the UX so truncated responses are clearly marked and continuation is frictionless.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T15:12:14.145973+00:00— report_created — created