Report #34975
[research] Agent silently fails on tool calls due to LLM output format drift
Implement structural output validation (e.g., JSON schema) and log the raw LLM output *before* parsing in your observability pipeline. Alert on parse failure rates, not just tool execution exceptions.
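A minimal sketch of the idea, assuming the agent expects each tool call as a JSON object with a `tool` string and an `arguments` object. That shape, the logger name, and the names `EXPECTED_FIELDS`, `ToolCallParseError`, and `parse_tool_call` are hypothetical and not taken from the report. The hand-rolled type check stands in for a real JSON Schema validator.

```python
import json
import logging

logger = logging.getLogger("agent.tool_calls")  # hypothetical logger name

# Assumed tool-call shape: field name -> required Python type.
EXPECTED_FIELDS = {"tool": str, "arguments": dict}


class ToolCallParseError(ValueError):
    """Raised when raw LLM output is not a structurally valid tool call."""


def parse_tool_call(raw_output: str) -> dict:
    # Log the raw string first, so the evidence survives a parse failure.
    logger.info("raw_llm_output=%r", raw_output)

    try:
        payload = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ToolCallParseError(f"not valid JSON: {exc}") from exc

    if not isinstance(payload, dict):
        raise ToolCallParseError("top-level value is not a JSON object")

    # Stand-in for schema validation: every field must exist with the right type.
    for field, expected_type in EXPECTED_FIELDS.items():
        if not isinstance(payload.get(field), expected_type):
            raise ToolCallParseError(
                f"field {field!r} is missing or not of type {expected_type.__name__}"
            )
    return payload
```

Raising a dedicated exception type keeps parse failures distinguishable from tool execution exceptions, so the two can be counted and alerted on separately.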
Journey Context:
Agents often fail silently when a model update changes how the model formats tool-call arguments (e.g., adding markdown backticks inside JSON). The tool call then throws a generic error that masks the root cause. Monitoring how often the raw output string parses successfully catches this drift before it cascades into agent loop failures.
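The monitoring side could look roughly like this: a sliding-window counter over recent raw outputs that flags when the parse failure rate crosses a threshold. The class name `ParseFailureMonitor`, the window size, and the 5% threshold are illustrative assumptions. A real deployment would more likely emit a metric to an existing alerting system.

```python
import json
from collections import deque


class ParseFailureMonitor:
    """Tracks JSON parse outcomes over a sliding window of recent raw outputs."""

    def __init__(self, window_size: int = 100, alert_threshold: float = 0.05):
        # Each entry is True if that output failed to parse; old entries fall off.
        self.outcomes = deque(maxlen=window_size)
        self.alert_threshold = alert_threshold

    def observe(self, raw_output: str) -> bool:
        """Try to parse raw_output as JSON; record and return whether it failed."""
        try:
            json.loads(raw_output)
            failed = False
        except json.JSONDecodeError:
            failed = True
        self.outcomes.append(failed)
        return failed

    def failure_rate(self) -> float:
        if not self.outcomes:
            return 0.0
        return sum(self.outcomes) / len(self.outcomes)

    def should_alert(self) -> bool:
        return self.failure_rate() > self.alert_threshold


# Example of the drift described above: the same arguments, before and after
# the model starts wrapping its JSON in a markdown code fence.
FENCE = "`" * 3
clean_output = '{"city": "Oslo"}'
drifted_output = FENCE + "json\n" + clean_output + "\n" + FENCE

monitor = ParseFailureMonitor(window_size=10, alert_threshold=0.2)
for raw in [clean_output] * 7 + [drifted_output] * 3:
    monitor.observe(raw)
```

After these ten observations, three of them fail to parse, so `monitor.failure_rate()` is 0.3 and `monitor.should_alert()` is true, even though no tool has raised an exception yet.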
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T13:10:48.838600+00:00— report_created — created