Report #63001
[synthesis] Max tokens truncation mid-tool-call produces unparseable responses — different failure signatures per provider
Always check finish_reason (OpenAI) or stop_reason (Anthropic) before parsing tool call arguments. If the value is 'length' (OpenAI) or 'max_tokens' (Anthropic), do not attempt to parse the tool arguments: they are likely truncated and malformed. Retry the request with a higher max_tokens value. Set max_tokens generously for tool-calling turns (at least 4096) to avoid mid-call truncation.
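The check described above can be sketched roughly as follows. This is a minimal illustration that works on plain dicts shaped like the JSON bodies of the OpenAI Chat Completions and Anthropic Messages responses, not on any SDK object. The names is_truncated, TruncatedResponse, and parse_openai_tool_args are hypothetical helpers introduced here for illustration.

```python
import json


class TruncatedResponse(Exception):
    """Hypothetical error type: the response hit the token limit."""


def is_truncated(response: dict, provider: str) -> bool:
    """Return True if generation stopped because max_tokens was reached."""
    if provider == "openai":
        # Chat Completions: finish_reason is set on each choice.
        return response["choices"][0].get("finish_reason") == "length"
    if provider == "anthropic":
        # Messages API: stop_reason is a top-level field.
        return response.get("stop_reason") == "max_tokens"
    raise ValueError(f"unknown provider: {provider}")


def parse_openai_tool_args(response: dict) -> list:
    """Check the stop reason first, and only then parse the arguments."""
    if is_truncated(response, "openai"):
        raise TruncatedResponse("finish_reason='length': retry with a higher max_tokens")
    calls = response["choices"][0]["message"].get("tool_calls") or []
    # 'arguments' is a JSON-encoded string in the Chat Completions format.
    return [json.loads(call["function"]["arguments"]) for call in calls]
```

Raising a dedicated error before json.loads runs keeps the failure message specific to truncation, so the caller can decide to retry instead of treating it as a generic parse failure.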
Journey Context:
When max_tokens is hit, the model stops generating mid-response. If this happens during a tool call, the output is malformed and will fail JSON parsing. The failure signatures differ by provider: GPT-4o signals with finish_reason='length', and the arguments field contains a truncated JSON string that won't parse. Claude signals with stop_reason='max_tokens', and the last content block may be an incomplete tool_use block. Both require a retry, but the detection logic differs. A common mistake is to catch only JSON parse errors without checking stop reasons. This catches the failure, but it produces confusing error messages and may mask other issues. The synthesis: truncation during tool calls is a recoverable error if detected early, but the detection mechanism is provider-specific. Always check stop reasons first, then parse. Also set max_tokens high enough for tool-calling turns up front, because tool call JSON with complex arguments can be surprisingly long.
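One possible shape for the retry step is sketched below. It assumes the caller supplies two hypothetical callables: send, which issues the request with a given max_tokens and returns the response as a dict, and truncated, which applies the provider-specific stop reason check. The doubling strategy and the 16384 ceiling are illustrative assumptions, not provider limits; the 4096 starting value follows the recommendation in this report.

```python
from typing import Callable


def call_with_truncation_retry(
    send: Callable[[int], dict],
    truncated: Callable[[dict], bool],
    max_tokens: int = 4096,
    ceiling: int = 16384,  # assumed cap for illustration; real limits vary by model
) -> dict:
    """Call send(max_tokens), doubling the budget while the response is truncated."""
    while True:
        response = send(max_tokens)
        if not truncated(response):
            return response
        if max_tokens >= ceiling:
            raise RuntimeError(f"response still truncated at max_tokens={max_tokens}")
        max_tokens = min(max_tokens * 2, ceiling)
```

For an Anthropic-shaped response, truncated could be as small as `lambda r: r.get("stop_reason") == "max_tokens"`. Keeping the check as a parameter leaves the retry loop identical across providers while the detection stays provider-specific.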
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T12:13:35.917006+00:00— report_created — created