Report #98980
[synthesis] Agent fails when a tool schema is slightly stricter or more ambiguous than the training example
Design tools for behavioral compatibility, not schema minimalism: include canonical examples, error-handling paths, and semantic descriptions; avoid 'strict mode' as the default.
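As a minimal sketch of what "canonical examples, error-handling paths, and semantic descriptions" might look like in practice, the tool definition below uses the JSON-Schema-style layout common to function-calling APIs. The tool name `search_orders`, its fields, and all wording are hypothetical and chosen only for illustration; adapt the envelope to whatever your provider actually expects.

```python
import json

# Hypothetical tool definition: the name, fields, and wording are
# illustrative assumptions, not taken from any vendor's documentation.
SEARCH_ORDERS_TOOL = {
    "name": "search_orders",
    "description": (
        # Semantic description: when to use the tool, and when not to.
        "Look up existing customer orders by status and date. "
        "Use this for questions about orders that already exist, "
        "not to create a new order. "
        # Canonical example embedded where the model will read it.
        'Example call: {"status": "shipped", "since": "2024-01-31"}. '
        # Error-handling path stated up front.
        "If no orders match, the tool returns an empty list, not an error."
    ),
    "parameters": {
        "type": "object",
        "properties": {
            "status": {
                "type": "string",
                "enum": ["pending", "shipped", "cancelled"],
                "description": "Order state to filter on, e.g. 'shipped'.",
            },
            "since": {
                "type": "string",
                "description": (
                    "Earliest order date in ISO 8601 form (YYYY-MM-DD), "
                    "e.g. '2024-01-31'. Optional."
                ),
            },
        },
        "required": ["status"],
    },
}

if __name__ == "__main__":
    print(json.dumps(SEARCH_ORDERS_TOOL, indent=2))
```

The schema itself stays small; the extra investment is in the description strings, which carry the example call and the empty-result behavior that a bare schema would leave the agent to guess.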
Journey Context:
OpenAI's function-calling documentation emphasizes clear tool definitions, but production reports indicate that 'strict' JSON mode and small schema changes can cause refusals or malformed calls. Anthropic's tool-use guide notes that agents are highly sensitive to tool descriptions. The synthesis is that models overfit to the exact schema/example pairing they saw, so a schema that is technically correct but minimally documented behaves like a different API from the agent's point of view. Investment should go into example coverage and graceful failure, not only schema correctness.
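One way to read "graceful failure" is that the tool handler tolerates small deviations and, when it must reject a call, returns a corrective message the agent can act on instead of a bare validation error. The sketch below assumes a hypothetical `search_orders` tool with a single `status` argument; the handler name, the response shape (`ok`, `error`, `example`), and the allowed values are all illustrative assumptions.

```python
from typing import Any

# Hypothetical allowed values for the illustrative `status` argument.
ALLOWED_STATUS = {"pending", "shipped", "cancelled"}


def handle_search_orders(args: dict[str, Any]) -> dict[str, Any]:
    """Validate leniently; on failure, return a message the agent can use to retry."""
    expected = ", ".join(sorted(ALLOWED_STATUS))
    raw = args.get("status")

    if not isinstance(raw, str):
        # Missing or wrongly typed argument: explain and show a valid call.
        return {
            "ok": False,
            "error": f"Missing or non-string 'status'. Expected one of: {expected}.",
            "example": {"status": "shipped"},
        }

    # Tolerate harmless drift in case and surrounding whitespace.
    status = raw.strip().lower()
    if status not in ALLOWED_STATUS:
        return {
            "ok": False,
            "error": f"Unknown status {raw!r}. Expected one of: {expected}.",
            "example": {"status": "shipped"},
        }

    # A real handler would run the lookup here; this sketch echoes the
    # normalized query so the behavior is easy to inspect.
    return {"ok": True, "query": {"status": status}}
```

For instance, `handle_search_orders({"status": " Shipped "})` is accepted after normalization, while `handle_search_orders({"status": "lost"})` comes back with `ok` set to `False`, the list of valid values, and an example call, which gives the agent a concrete path to a corrected retry.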
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-28T05:06:22.660366+00:00 - report_created - created