Report #65309
[synthesis] Model truncates output mid-JSON or mid-code when hitting max token limit
For GPT-4o, implement a client-side streaming watcher that detects unclosed brackets in the output and sends a 'continue' prompt. For Claude, add 'If you run out of space, output [CONTINUE] and I will prompt you to finish' to the system prompt. For Gemini, increase maxOutputTokens and handle abrupt stops.
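The bracket check at the core of such a watcher can be sketched as follows. This is a minimal illustration, not a tested production parser: the function names `unclosed_delimiters` and `looks_truncated` are hypothetical, and the scan only understands JSON-style `{}`, `[]`, and double-quoted strings with backslash escapes. It assumes the accumulated stream text is passed in as one string.

```python
def unclosed_delimiters(text: str) -> list[str]:
    """Return the openers still unclosed at the end of `text`, outermost first.

    Brackets inside double-quoted strings are ignored, and a string that
    never closes is reported as a trailing '"' entry.
    """
    stack: list[str] = []
    closers = {"}": "{", "]": "["}
    in_string = False
    escaped = False
    for ch in text:
        if in_string:
            if escaped:
                escaped = False          # character after a backslash is literal
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append(ch)
        elif ch in closers and stack and stack[-1] == closers[ch]:
            stack.pop()
    if in_string:
        stack.append('"')
    return stack


def looks_truncated(text: str) -> bool:
    """Heuristic: output with any unclosed delimiter was probably cut off."""
    return bool(unclosed_delimiters(text))
```

A watcher would call `looks_truncated` on the full accumulated text once the stream ends, then send the 'continue' prompt if it returns True. Balanced output can still be incomplete (for example, a truncated array that the model closed early), so this check catches broken syntax only.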
Journey Context:
When they hit the maximum output token limit, the models show distinct failure signatures. GPT-4o cuts off abruptly mid-sentence or mid-JSON, leaving syntactically broken output. Claude 3.5 Sonnet attempts graceful degradation: it tries to close the JSON array or return a truncated but syntactically valid snippet, and it often notes the truncation in words. Gemini 1.5 Pro simply stops generating. Because GPT-4o's truncation breaks parsers, you must parse streaming chunks defensively and prompt for continuation. Claude's output is easier to recover from, but because it still parses, missing data can go unnoticed.
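The prompt-for-continuation step can be sketched as a small loop that is independent of any provider SDK. This is a sketch under stated assumptions: `generate` is a hypothetical callable you supply, which takes the text produced so far and returns the next piece plus a flag saying whether the token limit was hit. In practice that flag would likely be derived from the provider's stop signal (for example a finish reason of "length" on OpenAI, a stop reason of "max_tokens" on Anthropic, or a finish reason of "MAX_TOKENS" on Gemini), but check the current API documentation before relying on those values.

```python
from typing import Callable, List, Tuple

# Hypothetical adapter signature: text so far -> (next piece, hit_token_limit).
Generate = Callable[[str], Tuple[str, bool]]


def generate_with_continuation(generate: Generate, max_rounds: int = 3) -> str:
    """Call `generate` repeatedly, stitching pieces until the limit is not hit.

    `max_rounds` bounds the number of calls so that a model which keeps
    hitting the limit cannot loop forever.
    """
    parts: List[str] = []
    for _ in range(max_rounds):
        piece, hit_limit = generate("".join(parts))
        parts.append(piece)
        if not hit_limit:
            break
    return "".join(parts)
```

Simple concatenation assumes the model resumes exactly where it stopped. If the model restates earlier text or adds a preamble when continuing, the joined result will need de-duplication before parsing.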
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:06:10.079724+00:00— report_created — created