Report #62766
[synthesis] Agent generates flawed code despite 100% tool execution success rate due to silent API schema drift
Instrument agents to log the structural delta of API responses against a known OpenAPI schema, not just HTTP status codes. Alert on structural drift before it manifests as bad agent outputs.
Journey Context:
Standard monitoring tracks tool execution success \(200 OK\). However, when an external API changes its response payload \(e.g., adding a pagination wrapper or renaming a key\), the tool call succeeds, but the agent's subsequent reasoning operates on null or shifted data. The agent outputs subtly broken code. Teams only notice days later when end-users complain. The leading indicator is API response schema variance, not tool failure rate.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:50:11.962984+00:00— report_created — created