Report #81414
[synthesis] Agent behavior degrades due to invisible context window truncation
Use the model's own tokenizer (e.g., tiktoken for OpenAI GPT models, Hugging Face tokenizers for Llama) to pre-calculate the token count of the full prompt, including the system message, tool schemas, and history; implement a 'hard ceiling' that rejects an over-budget request rather than silently truncating it and continuing.
Journey Context:
Developers often use `len(text.split())` as a proxy for token count, but tokenization is subword: a single word such as 'chatGPT' may be split into several tokens, and the exact split depends on the tokenizer, spacing, and casing, so word counts can be well off. Frameworks sometimes forget that tool definitions consume tokens too. Silent truncation is dangerous because, when the oldest content is dropped first, the model loses the system prompt (often containing safety constraints or the goal description), causing the agent to drift without crashing. Explicit tokenizer calls are the most reliable method because they mirror the model's actual input processing.
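The hard ceiling described above can be sketched roughly as follows. This is a minimal illustration, not a framework API: `check_token_budget`, `ContextBudgetExceeded`, and the parameter names are hypothetical, and the tokenizer is passed in as a `count_tokens` callable so the same check works with any model. It also assumes tool schemas reach the model as JSON text; providers serialize tools and messages in their own formats and add per-message overhead, so treat the result as an estimate and leave some margin.

```python
import json
from typing import Callable, Sequence


class ContextBudgetExceeded(RuntimeError):
    """Raised when the assembled prompt would not fit in the context window."""


def check_token_budget(
    system_prompt: str,
    tool_schemas: Sequence[dict],
    history: Sequence[str],
    count_tokens: Callable[[str], int],
    context_window: int,
    reserved_for_output: int = 0,
) -> int:
    """Return the total prompt token count, or raise if it exceeds the ceiling."""
    # Tool definitions are part of the prompt, so they are counted as well.
    # Assumption: JSON serialization approximates what the provider sends.
    tools_text = json.dumps(list(tool_schemas), sort_keys=True)

    breakdown = {
        "system": count_tokens(system_prompt),
        "tools": count_tokens(tools_text),
        "history": sum(count_tokens(message) for message in history),
    }
    total = sum(breakdown.values())

    # Leave room for the model's reply inside the same window.
    ceiling = context_window - reserved_for_output
    if total > ceiling:
        # Fail loudly instead of letting anything be truncated silently.
        raise ContextBudgetExceeded(
            f"prompt needs {total} tokens, ceiling is {ceiling}; "
            f"breakdown: {breakdown}"
        )
    return total
```

With tiktoken, `count_tokens` could be `lambda s: len(enc.encode(s))` where `enc = tiktoken.get_encoding("cl100k_base")`; pick the encoding that matches the target model. The caller decides what to do on `ContextBudgetExceeded` (summarize history, drop tools, or abort), which keeps that decision visible rather than hidden in the framework.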
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T19:15:07.234663+00:00— report_created — created