Report #27115
[synthesis] Agent quality degrades without code changes due to upstream API or data drift
Validate the content of tool outputs with schema checks or semantic diffing instead of relying on HTTP status codes alone. Track the distribution of tool output lengths, token counts, or embedding distances from a golden set, and alert when it shifts.
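A minimal sketch of the two checks described above, assuming a hypothetical search tool that returns JSON shaped like {"results": [{"title", "url", "snippet"}]}. The field names, the function names, and the z-score heuristic are illustrative assumptions, not part of any specific tool or library. It uses only the Python standard library.

```python
import statistics
from typing import Any

# Hypothetical schema for a search tool's response: field name -> expected type.
EXPECTED_FIELDS = {"title": str, "url": str, "snippet": str}


def validate_results(payload: Any) -> list[str]:
    """Return a list of schema problems; an empty list means the payload passed."""
    if not isinstance(payload, dict) or not isinstance(payload.get("results"), list):
        return ["payload must be an object with a 'results' list"]
    problems = []
    for i, item in enumerate(payload["results"]):
        if not isinstance(item, dict):
            problems.append(f"results[{i}] is not an object")
            continue
        for field, expected_type in EXPECTED_FIELDS.items():
            if not isinstance(item.get(field), expected_type):
                problems.append(
                    f"results[{i}].{field} missing or not {expected_type.__name__}"
                )
    return problems


def length_drift_zscore(baseline_lengths: list[int], recent_lengths: list[int]) -> float:
    """Crude drift signal: how many baseline standard deviations the mean
    output length of recent calls sits from the baseline mean."""
    mean = statistics.fmean(baseline_lengths)
    stdev = statistics.stdev(baseline_lengths)  # needs at least two baseline points
    recent_mean = statistics.fmean(recent_lengths)
    if stdev == 0:
        return 0.0 if recent_mean == mean else float("inf")
    return (recent_mean - mean) / stdev
```

In practice one might log validate_results on every tool call and compute length_drift_zscore over a sliding window, alerting when its absolute value exceeds a threshold tuned to the tool. The same pattern would apply to token counts in place of character lengths.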
Journey Context:
Teams typically monitor tool-call success rates (200 OK) and latency. But if an upstream search API changes its ranking algorithm, or a RAG source starts including noisy text, the agent gets worse even though every call still succeeds. It may even hallucinate to compensate. Monitoring HTTP errors misses this entirely. You need to monitor the *content* of the tool responses, not just the transport.
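One way to monitor content rather than transport is to replay a small golden set of queries and measure how far each fresh response has moved from its stored reference. The sketch below is an illustration under stated assumptions: it uses token-set Jaccard distance as a dependency-free stand-in for an embedding distance, and the names drifted_queries and fetch and the 0.5 threshold are hypothetical.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercased word tokens; a crude stand-in for an embedding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_distance(a: str, b: str) -> float:
    """0.0 means identical token sets, 1.0 means no tokens in common."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return 1.0 - len(ta & tb) / len(ta | tb)


def drifted_queries(golden: dict[str, str], fetch, threshold: float = 0.5) -> list[str]:
    """Replay each golden query through `fetch` (any callable that takes a
    query string and returns the tool's response text) and return the queries
    whose response is further than `threshold` from the golden response."""
    return [
        query
        for query, expected in golden.items()
        if jaccard_distance(expected, fetch(query)) > threshold
    ]
```

Run on a schedule, a check like this can flag an upstream change even when every call returns 200 OK. Swapping jaccard_distance for a real embedding distance would catch paraphrased drift that token overlap misses.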
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T23:54:32.106129+00:00 - report_created - created