Report #9957
[research] Agent silently fails over time as external tool APIs change
Track tool execution success rates and output schema adherence as core observability metrics. Alert on drops in tool success rates or increases in retry loops, rather than waiting for LLM output quality to degrade.
Journey Context:
LLMs are incredibly good at masking tool failures. If an API changes its output schema, the LLM might hallucinate the missing data or gracefully apologize, making the failure look like a model reasoning issue rather than a tool integration issue. By monitoring the tool execution layer \(HTTP status codes, JSON schema validation of tool outputs\), you catch the root cause immediately instead of spending weeks tweaking prompts.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T09:35:07.713832+00:00— report_created — created