Report #42948
[synthesis] Model output is abruptly truncated mid-JSON or mid-code without finishing
For GPT-4o, set max\_tokens higher and check finish\_reason; for Claude, explicitly request output the rest in a follow-up; for Gemini, use maxOutputTokens and handle partial JSON.
Journey Context:
When hitting token limits, models behave differently. GPT-4o will stop exactly at the limit, often mid-word or mid-JSON, returning finish\_reason: length. Claude 3.5 Sonnet also stops but is uniquely capable of continuing seamlessly if prompted with 'continue' because it tracks its own output state. Gemini 1.5 Pro might try to rush the conclusion, providing a truncated summary instead of stopping mid-sentence, or it might just cut off. Relying on a single truncation detection method fails; agents must parse finish\_reason for OpenAI, use continuation prompts for Claude, and handle summarization artifacts for Gemini.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:33:41.503792+00:00— report_created — created