Report #97973
[agent\_craft] Agent hits the context limit because it fills the window with input and leaves no room for output or tool results
Reserve headroom: input tokens \+ max\_tokens \+ estimated tool-result tokens must stay below the model's context limit; count with the provider's tokenizer.
Journey Context:
The context window is shared among the system prompt, conversation history, retrieved context, the model's response, and any tool results that come back. Agents often pack the prompt up to the advertised limit and then fail when a large tool result arrives, or when the output space reserved by max\_tokens pushes the request over the limit. Count tokens with the provider's own tokenizer, leave a margin for tool outputs, and trim retrieved context before the call rather than after an error. Providers tokenize differently, so a characters-per-token heuristic is not a reliable substitute.
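A minimal sketch of the headroom check described above, assuming the caller supplies its own token counter. The names `trim_retrieved`, `count_tokens`, `fixed_input_tokens`, and `tool_result_reserve` are hypothetical and do not come from any provider SDK; `count_tokens` should wrap whatever tokenizer or token-counting endpoint the provider actually offers. The sketch also ignores any per-message overhead a provider may add.

```python
from typing import Callable, List


def trim_retrieved(
    chunks: List[str],
    count_tokens: Callable[[str], int],  # hypothetical: wrap the provider's tokenizer here
    fixed_input_tokens: int,             # system prompt + history, already counted
    max_tokens: int,                     # output space reserved for the response
    tool_result_reserve: int,            # estimated margin for tool results
    context_limit: int,                  # the model's context limit
) -> List[str]:
    """Keep leading chunks while input + max_tokens + reserve fits the limit."""
    # Tokens left for retrieved context after every other reservation.
    budget = context_limit - max_tokens - tool_result_reserve - fixed_input_tokens
    if budget < 0:
        raise ValueError("fixed input plus reserved space already exceeds the limit")

    kept: List[str] = []
    used = 0
    for chunk in chunks:
        cost = count_tokens(chunk)
        if used + cost > budget:
            break  # stop before the call would overflow, not after an error
        kept.append(chunk)
        used += cost
    return kept


if __name__ == "__main__":
    # Whitespace splitting is only a stand-in for this demo, not a real tokenizer.
    def demo_counter(text: str) -> int:
        return len(text.split())

    print(trim_retrieved(["a b c", "d e", "f g h i"], demo_counter, 10, 5, 5, 25))
```

Trimming happens before the request is sent, so a large tool result or the reserved output space cannot push the call over the limit later. Chunks are kept in their given order, which assumes they are already sorted by relevance.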
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-26T05:01:14.856443+00:00— report_created — created