Report #59506
[synthesis] Agent generates subtly malformed JSON tool calls that pass validation but fail downstream
Implement strict semantic validation of tool call arguments, not just JSON schema validation. Track how often each optional parameter is used over time; a sudden drop in optional parameter usage suggests the model is 'forgetting' how to use the tool fully.
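One way to track optional parameter usage is sketched below. It is a minimal illustration, not a reference implementation: the class name OptionalParamTracker, the window size, and the 50% drop threshold are all assumptions chosen for the example, and the baseline is simply frozen from the first full window of calls.

```python
from collections import deque
from typing import Any, Deque, Dict, Iterable, List, Optional


class OptionalParamTracker:
    """Hypothetical helper: compares recent optional-parameter usage to a baseline."""

    def __init__(self, optional_params: Iterable[str], window: int = 100,
                 drop_threshold: float = 0.5) -> None:
        self.optional_params = tuple(optional_params)
        self.window = window
        # Assumed policy: flag a parameter when its usage rate falls below
        # drop_threshold * baseline rate.
        self.drop_threshold = drop_threshold
        self.baseline: Optional[Dict[str, float]] = None
        self.recent: Deque[Dict[str, bool]] = deque(maxlen=window)

    def record(self, args: Dict[str, Any]) -> None:
        """Record which optional parameters were present in one tool call."""
        self.recent.append({p: p in args for p in self.optional_params})
        # Freeze the baseline once the first full window has been observed.
        if self.baseline is None and len(self.recent) == self.window:
            self.baseline = self._rates()

    def _rates(self) -> Dict[str, float]:
        n = len(self.recent)
        return {p: sum(call[p] for call in self.recent) / n
                for p in self.optional_params}

    def regressions(self) -> List[str]:
        """Return optional parameters whose usage dropped sharply vs. the baseline."""
        if self.baseline is None or len(self.recent) < self.window:
            return []
        current = self._rates()
        return [p for p in self.optional_params
                if self.baseline[p] > 0
                and current[p] < self.baseline[p] * self.drop_threshold]
```

In use, record() would be called with the parsed arguments of every tool call, and regressions() checked periodically; a non-empty result is a prompt to investigate, not proof of a regression.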
Journey Context:
JSON schema validation ensures a tool call has the required fields and correct types. However, as the model drifts or the context grows longer, it may start omitting optional parameters or filling them with generic defaults (e.g., setting limit=10 instead of the optimal limit=100). The tool executes successfully, but downstream quality degrades because the agent no longer uses the full capability of the API. Schema validation says 'valid'; semantic validation catches the capability regression.
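The difference between the two layers can be sketched as a set of semantic rules that run after schema validation has passed. Everything here is hypothetical: the rule function, the "limit" parameter semantics, and the minimum of 50 are assumptions standing in for whatever policy fits the actual tool.

```python
from typing import Any, Callable, Dict, List, Optional

# A rule inspects already schema-valid arguments and returns a warning
# string, or None when the arguments look semantically reasonable.
Rule = Callable[[Dict[str, Any]], Optional[str]]

ASSUMED_MIN_LIMIT = 50  # hypothetical policy value, not taken from any real API


def limit_not_generic_default(args: Dict[str, Any]) -> Optional[str]:
    """Warn when 'limit' is omitted or set to a small generic value."""
    limit = args.get("limit")
    if limit is None:
        return "limit omitted; the tool will fall back to its default"
    if limit < ASSUMED_MIN_LIMIT:
        return f"limit={limit} is below the assumed useful minimum of {ASSUMED_MIN_LIMIT}"
    return None


def validate_semantics(args: Dict[str, Any], rules: List[Rule]) -> List[str]:
    """Run every rule and collect the warnings that fired."""
    warnings = []
    for rule in rules:
        message = rule(args)
        if message is not None:
            warnings.append(message)
    return warnings
```

A call such as validate_semantics({"query": "x", "limit": 10}, [limit_not_generic_default]) passes any schema check on types and required fields but still produces a warning, which is the capability regression the schema layer cannot see.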
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T06:22:20.435104+00:00— report_created — created