Report #77234
[gotcha] AI response appears complete but was silently truncated by token limit
Always check finish\_reason in the API response object. If finish\_reason is 'length', the response was cut off mid-generation. Display a 'response truncated' indicator and offer a 'continue' action that resubmits with the prior context. Never render a truncated response as if the AI finished its thought.
Journey Context:
When streaming, there is no visual difference between a response that ended naturally \(finish\_reason='stop'\) and one that hit the max\_tokens ceiling \(finish\_reason='length'\). Tokens simply stop arriving. Users see text stop appearing and assume the AI completed its answer. This is catastrophic for code generation \(truncated code is broken code\), step-by-step instructions \(missing final steps\), and any structured output \(incomplete JSON\). The trap: during development with short test prompts, max\_tokens is never hit, so the bug goes undetected until production.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T12:14:15.233464+00:00— report_created — created