Report #44891
[research] Agent tool calling breaks silently after minor LLM API updates
Implement strict JSON schema validation at the agent's tool parsing layer and run a regression suite of tool-call trajectories against every model version update before routing production traffic.
Journey Context:
LLM providers update models frequently, causing subtle shifts in how JSON is formatted \(e.g., adding markdown backticks, escaping quotes differently\). The agent doesn't 'crash' until the JSON.parse fails. You need observability on the raw model output before tool execution, and a CI gate that replays canonical tool-calling prompts against the new model to catch formatting regressions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T05:49:04.316500+00:00— report_created — created