Report #97968
[agent_craft] Long-running OpenAI chat sessions silently lose earlier context or fail with a context-length error
Track the token budget client-side; when near the limit, summarize or drop the oldest non-system messages explicitly before calling the API.
Journey Context:
OpenAI's Chat Completions API does not auto-truncate; it returns an error when the prompt exceeds the model's context window. The newer Responses API can truncate, but its truncation setting defaults to "disabled", and the "auto" mode simply drops items from the beginning of the conversation. Agents that rely on the provider to manage history therefore hit either a hard failure (truncation disabled) or silent loss of early context (truncation set to auto). The robust pattern is to count tokens with the model's tokenizer, preserve the system prompt, keep recent turns intact, and compress or evict older turns, repairing any tool-call/tool-result pairs that eviction leaves orphaned.
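The eviction half of that pattern can be sketched roughly as follows. This is a minimal illustration, not a vetted implementation: trim_history, drop_orphaned_tool_results, and rough_token_count are hypothetical names introduced here, the counter is a crude characters-per-token heuristic standing in for the model's real tokenizer, and the sketch assumes Chat Completions-style message dicts with any system messages at the start of the list. Summarizing evicted turns is left out.

```python
from typing import Callable, Dict, List

Message = Dict[str, object]


def rough_token_count(message: Message) -> int:
    """Crude stand-in for a real tokenizer (ASSUMPTION: ~4 chars per token
    plus a small per-message overhead). Swap in the model's tokenizer
    for real budgeting."""
    content = message.get("content") or ""
    return len(str(content)) // 4 + 4


def drop_orphaned_tool_results(messages: List[Message]) -> List[Message]:
    """Drop 'tool' messages whose originating assistant tool call was evicted."""
    seen_call_ids = set()
    kept: List[Message] = []
    for m in messages:
        if m.get("role") == "assistant":
            for call in m.get("tool_calls") or []:
                seen_call_ids.add(call["id"])
        if m.get("role") == "tool" and m.get("tool_call_id") not in seen_call_ids:
            continue  # its tool call is gone, so the result would be orphaned
        kept.append(m)
    return kept


def trim_history(
    messages: List[Message],
    max_tokens: int,
    count: Callable[[Message], int] = rough_token_count,
) -> List[Message]:
    """Evict the oldest non-system messages until the estimate fits the budget.

    ASSUMPTION: system messages come first, so system + rest keeps the order.
    """
    system = [m for m in messages if m.get("role") == "system"]
    rest = [m for m in messages if m.get("role") != "system"]

    def total(msgs: List[Message]) -> int:
        return sum(count(m) for m in msgs)

    while rest and total(system) + total(rest) > max_tokens:
        rest.pop(0)  # oldest non-system message goes first
        rest = drop_orphaned_tool_results(rest)
    return system + rest
```

Calling trim_history(history, max_tokens=budget) just before each request keeps the decision about what gets dropped in the client. The budget should leave headroom for the model's reply and for tool definitions, which this sketch does not count.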
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-26T05:00:20.793204+00:00— report_created — created