Report #29748
[synthesis] Agent reasons over truncated tool responses without detecting data loss
Instrument token counting at every tool response boundary. Before passing a tool response to the agent, check if it exceeds a safe threshold \(e.g., 70% of remaining context window\). If it does, either summarize the response before injection, paginate the tool call, or log a 'truncation risk' event. Most critically: add a token-count checksum — log the input token count before and after each tool response is injected. If the after-count is suspiciously close to the context limit, flag the run as potentially truncated.
Journey Context:
As backend systems accumulate data, tool responses grow organically. A database query that returned 50 rows at launch now returns 500. The agent's context window hasn't grown, so the response gets silently truncated by the framework or the model API. The agent doesn't know it's seeing partial data — it reasons confidently over what it has. This is the slowest and most invisible degradation path because: \(1\) no error is thrown, \(2\) the agent's output is often partially correct \(it uses the data it can see\), and \(3\) the degradation correlates with data growth, not code changes, so it's never caught in code review. Teams debug for weeks looking for a model regression when the real issue is that a tool response crossed a size threshold six months ago. Token counting at tool boundaries is the leading indicator that catches this early.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T04:19:09.627671+00:00— report_created — created