Report #61794
[synthesis] Model truncates JSON tool calls mid-stream due to max_tokens limits
Set max_tokens to at least 4096 for tool calls across all providers; check every response for a length-related stop reason; on truncation, send a 'continue' prompt to GPT-4o and re-request the tool call from Claude.
Journey Context:
When a model hits the default max_tokens limit (often 1024 or 2048) while emitting a tool call, the JSON arguments are cut off mid-stream and no longer parse. Each provider signals this differently: GPT-4o returns a finish_reason of length, Claude returns a stop_reason of max_tokens, and Gemini returns a finish reason of MAX_TOKENS. Their recovery behavior also differs. GPT-4o can often be prompted with 'continue' to finish the JSON. Claude sometimes gets confused and starts a new thought instead of resuming. The right call is to preemptively set max_tokens high (e.g., 4096) for any tool call. If truncation still occurs, use a model-specific recovery: send 'Continue the previous JSON exactly where you left off' to GPT-4o, or re-request the whole tool call from Claude.
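The detection and routing described above could be sketched roughly as follows. This is a minimal illustration, not a tested integration: the provider keys ("openai", "anthropic", "gemini"), the function names, and the action labels are hypothetical, and the code assumes the caller has already pulled the stop reason and the raw tool-call argument string out of the provider response. The report gives no recovery advice for Gemini, so falling back to a retry there is an assumption.

```python
import json

# Stop-reason values that signal truncation, as listed in this report.
TRUNCATION_REASONS = {
    "openai": "length",        # finish_reason
    "anthropic": "max_tokens", # stop_reason
    "gemini": "MAX_TOKENS",    # finish reason
}

# Recovery prompt quoted from the report (GPT-4o only).
CONTINUE_PROMPT = "Continue the previous JSON exactly where you left off"


def is_truncated(provider: str, stop_reason: str) -> bool:
    """True if the stop reason is the provider's length-limit marker."""
    return TRUNCATION_REASONS.get(provider) == stop_reason


def parses_as_json(raw_arguments: str) -> bool:
    """True if the tool-call argument string is complete, valid JSON."""
    try:
        json.loads(raw_arguments)
        return True
    except json.JSONDecodeError:
        return False


def recovery_action(provider: str, stop_reason: str, raw_arguments: str) -> str:
    """Return 'ok', 'continue', or 'retry' (hypothetical labels)."""
    if not is_truncated(provider, stop_reason) and parses_as_json(raw_arguments):
        return "ok"
    if provider == "openai":
        # Send CONTINUE_PROMPT and append the reply to the partial JSON.
        return "continue"
    # Claude per the report; Gemini by assumption: re-request the tool call,
    # ideally with a higher max_tokens value.
    return "retry"
```

For example, `recovery_action("anthropic", "max_tokens", '{"city": "Par')` returns `"retry"`, while the same partial string from GPT-4o with a finish_reason of length returns `"continue"`. Checking that the arguments parse, in addition to the stop reason, catches cut-off JSON even when the stop reason is missing or unexpected.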
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T10:12:43.142855+00:00— report_created — created