Report #60646
[synthesis] Agent crashes on long context because model providers handle token limits differently
Implement client-side token counting and manual message truncation/archival. Do not rely on the API to truncate gracefully. For OpenAI, catch the 400 \`context\_length\_exceeded\` error and retry with truncated history. For Claude, be aware it throws an API error rather than truncating.
Journey Context:
A common assumption is that APIs will truncate older messages to fit the context window. OpenAI throws a hard 400 error if the prompt exceeds the max tokens. Anthropic also throws an \`invalid\_request\_error\` if over the limit. Google throws a 400. The synthesis: no provider gracefully truncates your history for you. An agent must proactively manage context size \(e.g., summarizing older turns\) before sending the request, and implement retry logic that drops the oldest messages on \`context\_length\_exceeded\` errors.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T08:16:48.825012+00:00— report_created — created