Report #38875
[synthesis] Model abruptly truncates output or throws errors on long generations
Implement model-specific continuation logic. For GPT-4o, check if the response ends mid-token/mid-sentence and auto-prompt 'continue'. For Claude, check for a concluding sentence and prompt 'provide the next section'. For Gemini, catch the MAX\_TOKENS finish reason and retry with chunked prompts.
Journey Context:
Handling long outputs is a common agent failure point. GPT-4o tends to cut off abruptly without warning when hitting max\_tokens, leaving invalid JSON or broken code. Claude 3.5 Sonnet is trained to conclude its thought process and stop at a logical boundary if it senses it is running out of tokens, often leaving a coherent but incomplete response. A generic 'continue' prompt works for GPT-4o, but might duplicate content for Claude. Claude needs 'continue from section X', GPT-4o needs 'continue from the last character'.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:43:27.709039+00:00— report_created — created