Report #67742
[research] Agent fails to call tools after adding new descriptions or parameters to the schema
Include the tool schema definition as context in your eval dataset, and test the LLM's tool-calling accuracy against schema variations before deploying schema changes.
Journey Context:
LLMs are highly sensitive to tool schema formatting and description wording. A minor rephrasing of a tool description can drop tool-call success rates from 95% to 20%. Evals that assume a static tool schema will not catch regressions introduced by schema updates. Evaluating the model against the new schema in isolation, before deployment, surfaces tool-selection degradation that would otherwise go unnoticed.
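A minimal sketch of such a check, in Python, assuming the model is wrapped in a function that takes a prompt plus a list of tool schemas and returns the name of the tool it chose. The names here (EvalCase, tool_call_accuracy, compare_schema_variants, keyword_stub_model) and the schema dict shape are hypothetical and not tied to any vendor API. The stub model only stands in for a real LLM call so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Assumed schema shape: a plain dict with "name" and "description" keys.
ToolSchema = Dict[str, object]
# Hypothetical model interface: (prompt, tool schemas) -> chosen tool name or None.
ModelFn = Callable[[str, List[ToolSchema]], Optional[str]]


@dataclass
class EvalCase:
    prompt: str
    expected_tool: str


def tool_call_accuracy(model: ModelFn, tools: List[ToolSchema],
                       cases: List[EvalCase]) -> float:
    """Fraction of cases where the model picks the expected tool."""
    if not cases:
        raise ValueError("need at least one eval case")
    hits = sum(1 for c in cases if model(c.prompt, tools) == c.expected_tool)
    return hits / len(cases)


def compare_schema_variants(model: ModelFn,
                            variants: Dict[str, List[ToolSchema]],
                            cases: List[EvalCase],
                            baseline: str = "current",
                            max_drop: float = 0.05) -> Dict[str, dict]:
    """Score every schema variant on the same cases and flag drops vs. the baseline."""
    base = tool_call_accuracy(model, variants[baseline], cases)
    report = {}
    for name, tools in variants.items():
        acc = tool_call_accuracy(model, tools, cases)
        report[name] = {"accuracy": acc, "regressed": (base - acc) > max_drop}
    return report


def keyword_stub_model(prompt: str, tools: List[ToolSchema]) -> Optional[str]:
    """Stand-in for a real LLM: picks the tool whose description shares the most words."""
    words = set(prompt.lower().split())
    best, best_score = None, 0
    for tool in tools:
        score = len(words & set(str(tool["description"]).lower().split()))
        if score > best_score:
            best, best_score = str(tool["name"]), score
    return best


cases = [
    EvalCase("what is the weather in Paris", "get_weather"),
    EvalCase("send an email to Bob", "send_email"),
]
variants = {
    "current": [
        {"name": "get_weather", "description": "Get the current weather for a city"},
        {"name": "send_email", "description": "Send an email to a recipient"},
    ],
    # Candidate change: only the get_weather description was reworded.
    "candidate": [
        {"name": "get_weather", "description": "Retrieve meteorological conditions"},
        {"name": "send_email", "description": "Send an email to a recipient"},
    ],
}

report = compare_schema_variants(keyword_stub_model, variants, cases)
for name, row in report.items():
    print(name, row)
```

With the stub, the reworded description causes the weather case to miss, so the candidate variant is flagged as regressed. In practice the stub would be replaced by a call to the deployed model, and the check would run as a gate before a schema change ships.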
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:11:18.751196+00:00— report_created — created