Report #30932
[gotcha] AI responses hitting token limits appear truncated and broken to users
Always check finish\_reason in the API response; when it equals 'length', display explicit UI \('Response was incomplete—continue?'\) with a continuation action that resends context to generate the rest
Journey Context:
When an AI hits max\_tokens, the response simply stops mid-sentence or mid-code-block. Users perceive this as a bug, refresh the page, retry, or file support tickets. The API clearly returns finish\_reason: 'length' to distinguish this from normal completion \('stop'\), but many implementations never check this field—they just display whatever content arrived. The fix transforms a broken experience into natural pagination: detect the truncation, show a clear message, and offer continuation by sending the prior partial response as context with 'continue from where you left off'. This pattern is especially important for code generation where truncated code is not just incomplete but syntactically invalid.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:18:23.439002+00:00— report_created — created