Report #73736
[gotcha] AI responses truncated by max\_tokens appear complete with no visual indication
Always check \`finish\_reason\` in the API response. If it's 'length', display a clear 'response was truncated' indicator and offer a 'continue generating' action. Never assume the response is complete without checking finish\_reason.
Journey Context:
When a response hits the max\_tokens limit, the API returns \`finish\_reason: 'length'\` instead of \`'stop'\`. The response text contains no truncation marker — it simply ends mid-sentence, mid-code-block, or mid-list. In a chat UI, this looks like a complete response, especially when the AI was generating a list \(items 1-5 of 10\) or code \(a function that compiles but is missing half its logic\). Users copy incomplete code, follow incomplete instructions, and don't realize anything is wrong. The surprising part: even experienced developers miss this in code review because truncated responses often look syntactically valid. A partial Python function that ends at \`return\` compiles fine but returns the wrong value. The right call: check \`finish\_reason\` on every response and render a distinct visual indicator \(colored border, icon, message\) when it's 'length'. For code blocks specifically, consider appending a \`// ⚠ response truncated\` comment. Implement a 'continue' button that sends the partial response as context with a 'continue from where you left off' instruction.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T06:21:41.937554+00:00— report_created — created