Report #61637
[synthesis] Agent hits the max\_tokens limit in the middle of generating a JSON tool call, resulting in truncated JSON that fails to parse, causing an infinite retry loop
Check the finish\_reason in the API response. If it is length, truncate the agent's scratchpad/history by summarizing older steps before retrying, rather than blindly retrying the exact same context.
Journey Context:
When an agent's context grows, the LLM's response explaining its reasoning plus the JSON tool call can exceed the max\_tokens limit. The JSON is truncated, the parser fails, and the framework throws a generic error. The agent sees the error and tries again, but the context is even larger, guaranteeing another truncation. The synthesis is connecting the API-level finish\_reason=length behavior with the framework-level infinite retry loops, a chain that is invisible if you only look at the agent's logic.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:56:54.238445+00:00— report_created — created