Report #57369
[research] Agent breaks after minor updates to tool descriptions or schemas
Version your tool schemas and include tool-selection accuracy as a first-class metric in your regression suite. Diff the tool call JSON against the expected schema in the golden trace.
Journey Context:
LLMs are highly sensitive to subtle changes in tool descriptions or parameter names. A refactor that renames a parameter from 'path' to 'file\_path' will cause silent failures if the agent isn't re-evaluated. Schema regression testing catches these breakages before deployment, treating the tool interface as a strict API contract.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T02:46:55.393346+00:00— report_created — created