Report #81675
[synthesis] Tool definitions silently dropped or corrupted, causing missing tool calls
Keep the total tool schema size below roughly 10-15K tokens per request. Implement a dynamic tool selector that prunes irrelevant tools before the payload is sent, and validate the tool names in each response against the schema that was provided.
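A minimal sketch of such a selector, in Python. Everything here is an assumption made for illustration rather than part of any provider SDK: the function names (`estimate_tokens`, `relevance`, `select_tools`), the 4-characters-per-token estimate, and the naive keyword-overlap score. A real system would likely use the provider's tokenizer and an embedding-based ranker.

```python
import json
import re

# Assumption: use the lower end of the 10-15K range as the budget.
TOKEN_BUDGET = 10_000


def estimate_tokens(tool: dict) -> int:
    # Rough heuristic (assumption): about 4 characters per token
    # of the serialized tool definition.
    return len(json.dumps(tool)) // 4 + 1


def _words(text: str) -> set:
    # Lowercased alphanumeric words; "get_weather" yields {"get", "weather"}.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def relevance(query: str, tool: dict) -> int:
    # Naive score: number of words shared by the query and the
    # tool's name plus description.
    tool_text = f"{tool.get('name', '')} {tool.get('description', '')}"
    return len(_words(query) & _words(tool_text))


def select_tools(query: str, tools: list, budget: int = TOKEN_BUDGET) -> list:
    """Keep the most relevant tools that fit inside the token budget."""
    ranked = sorted(tools, key=lambda t: relevance(query, t), reverse=True)
    selected, used = [], 0
    for tool in ranked:
        if relevance(query, tool) == 0:
            break  # ranked is sorted descending, so the rest score 0 too
        cost = estimate_tokens(tool)
        if used + cost > budget:
            continue  # too large to fit; a smaller tool may still fit
        selected.append(tool)
        used += cost
    return selected
```

Calling `select_tools(user_message, all_tools)` before building the request keeps the payload under the budget. Tools with no keyword overlap are dropped entirely, which is a deliberately aggressive choice for this sketch.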
Journey Context:
When tool schemas exceed context limits, providers fail differently. GPT-4o silently drops tools from its context window and never calls them, acting as if they don't exist. Claude returns a hard API error (400 Bad Request) if the prompt is too large. Gemini attempts to parse the truncated JSON and hallucinates corrupted tool signatures, leading to invalid calls. The synthesis is that large tool suites cause silent behavioral degradation in GPT-4o, hard API errors in Claude, and chaotic failures in Gemini. Dynamic pruning is the only mitigation that applies to all three.
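Because some of these failures are silent, it can help to check every returned tool call against the tools that were actually sent. The sketch below assumes a normalized call shape of `{"name": ..., "arguments": {...}}` and tool definitions carrying a JSON-Schema-style `parameters` object. Both shapes, and the function name `validate_tool_call`, are assumptions for illustration; each provider's real response format differs and would need to be mapped into this shape first.

```python
def validate_tool_call(call: dict, provided_tools: list) -> list:
    """Return a list of problems; an empty list means the call looks valid."""
    by_name = {tool["name"]: tool for tool in provided_tools}

    name = call.get("name")
    if name not in by_name:
        # Covers hallucinated or corrupted tool names.
        return [f"unknown tool: {name!r}"]

    args = call.get("arguments", {})
    if not isinstance(args, dict):
        return ["arguments is not an object"]

    # Assumption: parameters follow the JSON Schema object layout.
    schema = by_name[name].get("parameters", {})
    allowed = set(schema.get("properties", {}))
    required = set(schema.get("required", []))

    problems = []
    for key in sorted(required - set(args)):
        problems.append(f"missing required argument: {key}")
    for key in sorted(set(args) - allowed):
        problems.append(f"unexpected argument: {key}")
    return problems
```

A non-empty result is a signal to reject the call and retry, possibly with a smaller tool set. This check only compares names; it does not validate argument types or values.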
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T19:41:14.581312+00:00— report_created — created