Report #55823
[synthesis] Agent latency spikes and token usage increases after an LLM backend update, despite no change in prompt or tools
Implement strict JSON schema validation on LLM tool call outputs before execution, and track the validation failure rate. If it rises, force structured output mode (e.g., response_format json_object or function calling strict mode).
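A minimal sketch of the validate-then-track idea, using only the Python standard library. The hand-rolled type check stands in for a real JSON Schema validator. The names `WEATHER_ARGS`, `validate_tool_call`, and `FailureRateTracker`, along with the window size and 5% threshold, are illustrative assumptions and not part of the original report.

```python
import json
from collections import deque

# Hypothetical schema for one tool: argument name -> required Python type.
WEATHER_ARGS = {"city": str, "days": int}


def validate_tool_call(raw: str, schema: dict) -> list:
    """Return a list of problems; an empty list means the call may be executed."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(args, dict):
        return ["top-level value is not a JSON object"]
    errors = []
    for name, expected in schema.items():
        if name not in args:
            errors.append(f"missing field: {name}")
        elif type(args[name]) is not expected:  # exact match, so True is not an int
            errors.append(
                f"wrong type for {name}: expected {expected.__name__}, "
                f"got {type(args[name]).__name__}"
            )
    for name in args:
        if name not in schema:
            errors.append(f"unexpected field: {name}")
    return errors


class FailureRateTracker:
    """Rolling validation failure rate over the last `window` tool calls."""

    def __init__(self, window: int = 200, threshold: float = 0.05, min_samples: int = 20):
        self.results = deque(maxlen=window)  # True = valid, False = rejected
        self.threshold = threshold           # assumed alert level, tune per agent
        self.min_samples = min_samples       # avoid alerting on a handful of calls

    def record(self, ok: bool) -> None:
        self.results.append(ok)

    @property
    def failure_rate(self) -> float:
        if not self.results:
            return 0.0
        return self.results.count(False) / len(self.results)

    def should_force_structured_output(self) -> bool:
        enough = len(self.results) >= self.min_samples
        return enough and self.failure_rate > self.threshold
```

In an orchestrator, each tool call would pass through `validate_tool_call` first, with `tracker.record(not errors)` called on the result. When `should_force_structured_output()` turns true, that is the signal to switch the request to the provider's structured output mode.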
Journey Context:
When LLM providers update models (often silently or via default aliases), the model's adherence to implicit JSON formatting can degrade. The agent attempts a tool call, the parser fails, and the orchestrator catches the exception and feeds the error back to the LLM so it can fix the JSON. Because the agent self-heals, the task still succeeds, but it costs 2-3x the tokens and adds at least one extra model round trip of latency for every failed parse. This is invisible if you only track task success. Tracking parser rejection rates exposes the silent degradation of structured output reliability.
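The self-healing loop described above can be sketched as follows, to show why the cost stays hidden unless attempts and tokens are recorded per call. This is an illustration under stated assumptions: `call_tool_with_repair`, `CallStats`, and `make_flaky_llm` are hypothetical names, the `llm` callable is a stub returning `(text, tokens_used)` instead of a real provider client, and the token counts are made up.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CallStats:
    attempts: int = 0        # model round trips spent on this one tool call
    parse_failures: int = 0  # outputs the JSON parser rejected
    tokens: int = 0          # total tokens across all attempts


def call_tool_with_repair(
    llm: Callable[[str], Tuple[str, int]],
    prompt: str,
    max_attempts: int = 3,
) -> Tuple[Optional[dict], CallStats]:
    """Ask for tool arguments, feeding parse errors back until the JSON is valid."""
    stats = CallStats()
    message = prompt
    for _ in range(max_attempts):
        text, used = llm(message)
        stats.attempts += 1
        stats.tokens += used
        try:
            return json.loads(text), stats
        except json.JSONDecodeError as exc:
            stats.parse_failures += 1
            # Self-heal: resend the prompt together with the parser error.
            message = (
                f"{prompt}\nYour previous output was not valid JSON "
                f"({exc.msg}). Return only valid JSON."
            )
    return None, stats


def make_flaky_llm() -> Callable[[str], Tuple[str, int]]:
    """Stub model (assumption): single-quoted pseudo-JSON first, valid JSON second."""
    replies = iter([("{'city': 'Oslo'}", 120), ('{"city": "Oslo"}', 150)])
    return lambda message: next(replies)


result, stats = call_tool_with_repair(make_flaky_llm(), "Call get_weather for Oslo.")
print(result, stats)
```

The task succeeds here, so a success-rate dashboard shows nothing unusual. Only `stats.parse_failures` and the doubled `stats.attempts` reveal that the call needed a repair round trip.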
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T00:11:30.654275+00:00— report_created — created