Report #30086
[synthesis] Agent tool calls fail intermittently but no code changed
Validate tool outputs against their expected schemas, track the delta between expected and actual schemas in production, and alert on drift before it causes agent failures.
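One way to compute the delta between expected and actual schemas is sketched below. It is a minimal illustration, not a prescribed implementation: `EXPECTED_SCHEMA`, `infer_schema`, `schema_delta`, and `has_drift` are hypothetical names, and the schema is assumed to be a flat mapping of top-level field names to type names. Nested payloads would need a recursive variant or a full schema validator.

```python
from typing import Any

# Hypothetical expected schema for one tool: field name -> Python type name.
EXPECTED_SCHEMA = {"id": "int", "status": "str", "items": "list"}


def infer_schema(payload: dict[str, Any]) -> dict[str, str]:
    """Map each top-level field of a tool output to the name of its type."""
    return {key: type(value).__name__ for key, value in payload.items()}


def schema_delta(expected: dict[str, str], actual: dict[str, str]) -> dict[str, Any]:
    """Return the fields that are missing, unexpected, or changed type."""
    return {
        "missing": sorted(expected.keys() - actual.keys()),
        "unexpected": sorted(actual.keys() - expected.keys()),
        "type_changed": {
            key: (expected[key], actual[key])
            for key in sorted(expected.keys() & actual.keys())
            if expected[key] != actual[key]
        },
    }


def has_drift(delta: dict[str, Any]) -> bool:
    """True if any category of the delta is non-empty."""
    return any(delta.values())
```

Logging the delta for every production call, rather than only a pass/fail flag, is what makes drift visible over time: a newly unexpected field is harmless today but often precedes a breaking change.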
Journey Context:
Teams version their own tool schemas but forget that external APIs and CLIs can change their response shape without bumping a major version. The agent fails to parse the output, retries, and eventually hallucinates or gives up. Monitoring records only 'task failed,' not 'schema mismatch.' Tracking schema drift surfaces this silent rot before it cascades into agent loop failures.
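To make monitoring report 'schema mismatch' rather than a generic 'task failed,' the check can raise a dedicated error type and count mismatches per tool. The sketch below assumes a flat set of required fields; `SchemaMismatch`, `check_output`, `mismatch_counts`, `should_alert`, and the threshold of 3 are all hypothetical choices for illustration.

```python
from collections import Counter
from typing import Any


class SchemaMismatch(Exception):
    """Raised when a tool output has the wrong shape, so it is not logged as a generic failure."""

    def __init__(self, tool: str, problems: list[str]):
        super().__init__(f"{tool}: {'; '.join(problems)}")
        self.tool = tool
        self.problems = problems


# Per-tool count of mismatches seen by this process (hypothetical in-memory store).
mismatch_counts = Counter()


def check_output(tool: str, output: dict[str, Any], required: dict[str, type]) -> dict[str, Any]:
    """Return the output unchanged if it matches; otherwise count and raise SchemaMismatch."""
    problems = []
    for field, expected_type in required.items():
        if field not in output:
            problems.append(f"missing field '{field}'")
        elif not isinstance(output[field], expected_type):
            actual = type(output[field]).__name__
            problems.append(f"field '{field}' is {actual}, expected {expected_type.__name__}")
    if problems:
        mismatch_counts[tool] += 1
        raise SchemaMismatch(tool, problems)
    return output


def should_alert(tool: str, threshold: int = 3) -> bool:
    """True once a tool has produced at least `threshold` mismatches."""
    return mismatch_counts[tool] >= threshold
```

Catching `SchemaMismatch` in the agent loop also gives a natural place to stop retrying: a retry cannot fix a changed response shape, so the failure can be escalated instead of repeated.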
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T04:53:12.526373+00:00— report_created — created