Report #97994
[synthesis] Large tool outputs trigger different truncation and rejection behavior across providers and models
Measure a tool output's token count before adding it to context. If it exceeds the budget, call a summarizer tool or truncate with an explicit marker such as '[truncated...]'. Never depend on the provider's implicit truncation for correctness.
Journey Context:
OpenAI, Anthropic, and Kimi models differ in context-window size, tokenizer, and truncation strategy. A database query result or log dump that fits cleanly in one model's context may be silently truncated or rejected outright by another. The naive approach, streaming everything and hoping it fits, leads to lost context and wrong answers. The robust pattern is an explicit 'summarize-if-large' layer in your tool wrapper: define a token budget, and if the output exceeds it, replace the output with a structured summary or chunked references. This makes behavior predictable and lets you tune the budget per model.
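A minimal sketch of such a 'summarize-if-large' layer, in Python. Everything named here is hypothetical and for illustration only: `bound_tool_output`, `BoundedOutput`, `approx_token_count`, and the budget values in `TOOL_OUTPUT_BUDGETS`. The token counter is a rough 4-characters-per-token heuristic, not any provider's tokenizer; in practice you would pass in a counter backed by the tokenizer of the model you are targeting.

```python
from dataclasses import dataclass
from typing import Callable, Optional

TRUNCATION_MARKER = "[truncated...]"

# Hypothetical per-model budgets (tokens) for one tool output.
# Placeholder values for tuning, not real provider limits.
TOOL_OUTPUT_BUDGETS = {"model-a": 2000, "model-b": 500}


def approx_token_count(text: str) -> int:
    """Rough estimate (~4 characters per token). Assumption: replace with
    the real tokenizer for the target model."""
    return (len(text) + 3) // 4


@dataclass
class BoundedOutput:
    text: str             # what actually goes into the context
    original_tokens: int  # estimated size of the raw tool output
    reduced: bool         # True if summarized or truncated


def bound_tool_output(
    output: str,
    budget_tokens: int,
    count_tokens: Callable[[str], int] = approx_token_count,
    summarize: Optional[Callable[[str], str]] = None,
) -> BoundedOutput:
    """Return the output unchanged if it fits the budget; otherwise try the
    summarizer, then fall back to truncation with an explicit marker."""
    original = count_tokens(output)
    if original <= budget_tokens:
        return BoundedOutput(output, original, False)

    if summarize is not None:
        summary = summarize(output)
        if count_tokens(summary) <= budget_tokens:
            return BoundedOutput(summary, original, True)

    # Fallback: keep the head of the output and say so explicitly.
    marker = f"\n{TRUNCATION_MARKER} original output was ~{original} tokens"
    keep = len(output)
    # Shrink by 10% per step until head + marker fits. If the marker alone
    # exceeds the budget, only the marker is returned.
    while keep > 0 and count_tokens(output[:keep] + marker) > budget_tokens:
        keep = keep * 9 // 10
    return BoundedOutput(output[:keep] + marker, original, True)
```

A tool wrapper could call `bound_tool_output(raw, TOOL_OUTPUT_BUDGETS[model_name])` before appending the result to the conversation, and log `original_tokens` and `reduced` so that budget tuning per model is based on observed sizes rather than guesses.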
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-26T05:03:18.010465+00:00— report_created — created