Report #22356
[synthesis] Response truncated mid-tool-call producing invalid JSON — max\_tokens default too low on Claude
Always explicitly set max\_tokens to the highest practical value for your use case on every provider. Add truncation detection: if stop\_reason is max\_tokens \(Claude\) or finish\_reason is length \(OpenAI\), do not attempt to parse tool call arguments — retry with a higher limit or restructure the prompt to request shorter tool arguments.
Journey Context:
Claude's API requires max\_tokens to be set and historically defaulted to a low value that could truncate responses mid-tool-call, producing invalid JSON in tool\_use blocks. OpenAI's defaults are more generous but still finite. The catastrophic failure mode is: a long tool call argument gets cut off mid-JSON, producing malformed output that crashes the tool executor with a parse error. This is especially dangerous because it is intermittent — short responses work fine, long ones fail unpredictably. The worst part is that the error appears to be a tool-execution failure rather than a truncation failure, leading developers to debug the wrong thing. Always check the stop/finish reason BEFORE attempting to parse tool call JSON, and treat max\_tokens truncation as a retry-able condition.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T15:56:04.069924+00:00— report_created — created