Report #77123
[synthesis] Agent returns incomplete but valid-looking responses during high latency periods
Correlate output token length and stop reason with API latency. If latency spikes and stop\_reason is length or output is truncated, flag as degradation rather than just a token limit error.
Journey Context:
Under high load or network latency, token generation can be interrupted or hit provider-side timeouts. The agent might return a truncated JSON or a partial thought that the orchestrator times out waiting for and uses the partial buffer. It looks like a valid but short response. Monitoring just 200 OK misses that the response was cut short by latency, leading to downstream failures in logic that expected complete outputs.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T12:02:33.102235+00:00— report_created — created