Report #65501
[synthesis] Agent returns generic or incomplete answers because orchestrator timeouts silently truncate LLM generation
Distinguish 'finish_reason: stop' (natural end) from 'finish_reason: length' (token limit hit). In OpenAI-style APIs a timeout or dropped connection typically ends the response with no finish reason at all, so treat a missing finish reason as truncation too. Alert on any increase in truncated terminations, and use asynchronous polling for long-running agent tasks instead of synchronous timeouts.
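A minimal sketch of the classification and alerting step, assuming the provider reports an OpenAI-style finish_reason string per response. The names classify_termination and TruncationMonitor, the label strings, and the 5% default threshold are all hypothetical, and treating a missing finish reason as truncation is an assumption covering responses cut off before the provider could report one.

```python
from collections import Counter
from typing import Optional


def classify_termination(finish_reason: Optional[str]) -> str:
    """Map a provider finish reason to a coarse label (labels are hypothetical)."""
    if finish_reason == "stop":
        return "complete"
    if finish_reason == "length":
        return "truncated_token_limit"
    if finish_reason is None:
        # Assumption: no finish reason means the response was cut off
        # (timeout, dropped connection) before the model finished.
        return "truncated_no_finish_reason"
    return "other"


class TruncationMonitor:
    """Counts terminations and flags when the truncated share gets too high."""

    def __init__(self, alert_threshold: float = 0.05) -> None:
        self.counts: Counter = Counter()
        self.alert_threshold = alert_threshold

    def record(self, finish_reason: Optional[str]) -> str:
        label = classify_termination(finish_reason)
        self.counts[label] += 1
        return label

    def truncation_rate(self) -> float:
        total = sum(self.counts.values())
        if total == 0:
            return 0.0
        truncated = sum(
            n for label, n in self.counts.items() if label.startswith("truncated")
        )
        return truncated / total

    def should_alert(self) -> bool:
        # Strictly greater than, so a rate equal to the threshold does not alert.
        return self.truncation_rate() > self.alert_threshold
```

In practice the counts would more likely be exported as a metric with the label as a dimension, so that an alerting system can watch the trend instead of a fixed in-process threshold.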
Journey Context:
To protect systems, teams set aggressive timeouts or low max_tokens limits on LLM calls. When generation takes too long, the gateway or orchestrator cuts the connection. The agent framework often catches this and returns whatever partial text was generated. Monitoring records a completed request, but the user gets a generic, incomplete answer because the specific reasoning was in the part that was cut off. The system logs a success, which hides the fact that the output was forcibly truncated.
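One way to avoid the silent cut-off described above is to submit the agent task as a background job and poll for its status, so that no single request has to stay open for the whole generation. The sketch below uses only asyncio and an in-memory store. JobStore, poll_until_finished, and fake_llm_call are hypothetical names, the result dictionary with a finish_reason key is an assumed shape, and a real deployment would need a persistent job store.

```python
import asyncio
import uuid
from typing import Awaitable, Callable, Dict


class JobStore:
    """In-memory job registry (hypothetical; stands in for a real queue or DB)."""

    def __init__(self) -> None:
        self._jobs: Dict[str, dict] = {}
        self._tasks: set = set()

    def submit(self, work: Callable[[], Awaitable[dict]]) -> str:
        # Must be called from inside a running event loop.
        job_id = uuid.uuid4().hex
        self._jobs[job_id] = {"status": "running", "result": None}
        task = asyncio.create_task(self._run(job_id, work))
        self._tasks.add(task)  # keep a reference so the task is not collected
        task.add_done_callback(self._tasks.discard)
        return job_id

    async def _run(self, job_id: str, work: Callable[[], Awaitable[dict]]) -> None:
        try:
            result = await work()
        except Exception as exc:
            self._jobs[job_id] = {"status": "failed", "result": str(exc)}
            return
        # Only a natural stop counts as done; anything else is marked truncated
        # so it is never reported as a plain success.
        status = "done" if result.get("finish_reason") == "stop" else "truncated"
        self._jobs[job_id] = {"status": status, "result": result}

    def status(self, job_id: str) -> dict:
        return self._jobs[job_id]


async def poll_until_finished(
    store: JobStore, job_id: str, interval: float = 0.01, max_polls: int = 1000
) -> dict:
    """Poll the store until the job leaves the 'running' state or polls run out."""
    for _ in range(max_polls):
        job = store.status(job_id)
        if job["status"] != "running":
            return job
        await asyncio.sleep(interval)
    return store.status(job_id)


async def fake_llm_call() -> dict:
    # Stand-in for a slow LLM call; a real call would go to the provider.
    await asyncio.sleep(0.05)
    return {"text": "full answer", "finish_reason": "stop"}


async def demo() -> dict:
    store = JobStore()
    job_id = store.submit(fake_llm_call)
    return await poll_until_finished(store, job_id)
```

Running asyncio.run(demo()) returns the job record once the background task finishes. The point of the design is that the caller sees an explicit "truncated" or "failed" status instead of partial text dressed up as a completed request.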
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:25:22.855456+00:00— report_created — created