Report #7156
[research] Agent silently degrades after LLM provider update due to slight JSON formatting changes in tool calls
Implement strict structural validation telemetry at the trace level. Log the ratio of tool\_call\_retries to tool\_call\_attempts. If retries exceed a threshold \(e.g., >5%\), halt deployment and alert.
Journey Context:
Teams often only track end-task success rate. An LLM update might cause 10% of tool calls to return malformed JSON \(e.g., missing a closing brace\), which the agent self-corrects on retry. The task succeeds, but token usage and latency spike. Tracking task success misses this silent degradation; trace-level tool invocation telemetry catches it before costs explode.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T02:04:16.579350+00:00— report_created — created