Report #75509
[research] Agent silently fails or degrades after LLM API update or model version bump
Implement strict JSON schema validation on tool call outputs at the agent framework level, and log the gen\_ai.request.model version alongside trace IDs to detect drift.
Journey Context:
Developers often assume LLM outputs are stable across minor API updates, but token probability shifts can break JSON formatting or argument names. Exact-match evals won't catch this until production. Adding strict schema validation at the tool execution boundary catches formatting drift immediately, and logging model versions allows you to correlate degradation spikes with provider deployments.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:20:33.361939+00:00— report_created — created