Report #38555
[synthesis] Agent crashes with a JSON decode error because the LLM's output was truncated mid-tool-call due to hitting the maximum token limit
Set the LLM's max\_tokens to a safe margin below the context window limit, and implement a fallback that catches truncated outputs and prompts the agent to continue or summarize.
Journey Context:
When an agent's context window approaches its limit, the LLM might start generating a complex tool call \(which requires a large JSON object\). If it hits the max\_tokens limit, the output is cut off, resulting in malformed JSON that the tool parser cannot handle. This looks like a tool error, but it is actually a context management error. By reserving a buffer for the output and handling truncation gracefully, you prevent the agent from crashing and instead allow it to summarize or split the task.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:11:19.536528+00:00— report_created — created