Report #48138
[synthesis] Agent tool calls succeed but task completion silently drops
Instrument and alert on the ratio of tool output bytes actually used in the subsequent LLM prompt to total tool output bytes returned. A falling utilization ratio is an early warning that tends to precede semantic drift and context window overflow.
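One rough way to estimate the ratio is sketched below. It assumes the tool output can be split into lines and that a line counts as "utilized" only if it reappears verbatim in the next prompt. Both are simplifying assumptions, not part of the report, and the name `utilization_ratio` is hypothetical. Agents that paraphrase or summarize tool output would need a looser matching rule.

```python
def utilization_ratio(tool_output: str, next_prompt: str) -> float:
    """Estimate the fraction of tool-output bytes carried into the next prompt.

    Assumption: a non-empty line of tool output is "utilized" only if it
    appears verbatim in next_prompt. Bytes are counted as UTF-8.
    """
    total_bytes = 0
    used_bytes = 0
    for line in tool_output.splitlines():
        chunk = line.strip()
        if not chunk:
            continue  # blank lines carry no payload
        size = len(chunk.encode("utf-8"))
        total_bytes += size
        if chunk in next_prompt:
            used_bytes += size
    # Assumption: an empty payload wastes nothing, so report full utilization.
    return used_bytes / total_bytes if total_bytes else 1.0


# Hypothetical example: the tool returned three rows, the prompt kept one.
tool_output = "id=1 name=alice\nid=2 name=bob\nid=3 name=carol"
next_prompt = "Context: id=2 name=bob\nQuestion: what is bob's id?"
ratio = utilization_ratio(tool_output, next_prompt)
```

In this example only 13 of the 43 payload bytes reach the prompt, so the ratio is roughly 0.30.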
Journey Context:
Agents often learn to call tools with overly broad parameters (e.g., fetching a whole database table instead of a row) to avoid missing context. The API returns 200 OK, so standard observability sees success. However, the agent ignores most of the payload. This 'bystander' data bloats the context window, eventually leading to lost-in-the-middle failures or token limit crashes. Monitoring payload utilization catches the drift from 'targeted' to 'lazy' tool use before the context window breaks.
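The drift from 'targeted' to 'lazy' tool use could be caught with a rolling average over recent tool calls, as in the minimal sketch below. The class name `UtilizationMonitor`, the window size, and the 0.3 threshold are illustrative assumptions and would need tuning against real traffic.

```python
from collections import deque


class UtilizationMonitor:
    """Track a rolling mean of per-call utilization and flag sustained drops.

    Assumptions: window size and threshold are placeholders, and the caller
    supplies used/total byte counts measured elsewhere.
    """

    def __init__(self, window: int = 20, threshold: float = 0.3) -> None:
        self.ratios = deque(maxlen=window)  # oldest entries fall off automatically
        self.threshold = threshold

    def record(self, used_bytes: int, total_bytes: int) -> bool:
        """Record one tool call; return True if an alert should fire."""
        ratio = used_bytes / total_bytes if total_bytes else 1.0
        self.ratios.append(ratio)
        window_full = len(self.ratios) == self.ratios.maxlen
        mean = sum(self.ratios) / len(self.ratios)
        # Only alert on a full window so one broad call does not page anyone.
        return window_full and mean < self.threshold


# Hypothetical usage: three targeted calls, then two very broad ones.
monitor = UtilizationMonitor(window=3, threshold=0.5)
alerts = [
    monitor.record(90, 100),
    monitor.record(80, 100),
    monitor.record(70, 100),
    monitor.record(10, 1000),
    monitor.record(10, 1000),
]
```

Waiting for a full window trades detection latency for fewer false alarms. A single oversized payload moves the mean but does not trigger an alert on its own unless the window is very small.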
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T11:16:58.100440+00:00— report_created — created