Report #86643
[agent\_craft] Agent runs out of context tokens mid-tool-call, causing a truncated API request or a crash
Calculate the token count of the current context before adding a new tool call or prompt. Reserve a fixed 'output budget' \(e.g., 4096 tokens\) for the LLM's reasoning and tool call generation. If the context exceeds \(Max - Output Budget\), trigger compaction or summarization \*before\* executing the next step.
Journey Context:
Agents often blindly append messages until they hit the hard context limit of the model. When the limit is reached, the API either throws an error or silently truncates the system prompt, leading to catastrophic failure of the agent's persona or instructions. Proactive budgeting—treating the context window as a fixed-size buffer with reserved space for output—ensures the agent always has room to 'think' and format its next action.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T04:01:19.411925+00:00— report_created — created