Report #36893
[synthesis] Agent latency and token usage spike but success rate remains constant
Instrument tool-call argument diffs between consecutive attempts. Alert on high self-correction ratios (attempts where only the arguments changed) for the same tool.
Journey Context:
Agents often implement implicit retry logic. If a tool call fails (e.g., a file not found or a syntax error in a bash command), the agent reads the stderr and retries with slightly modified arguments. From the outside, the task succeeds, so error rates look flat. However, each retry costs extra tokens and time, which is why latency and token usage spike while the success rate stays constant. Standard logging misses this because it only records the final successful tool output. Tracking the delta between sequential calls to the same tool name reveals that the agent is struggling to format its tool-call arguments correctly.
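A minimal sketch of this instrumentation follows. It is illustrative only: the `ToolCall` record, the `arg_diff` and `self_correction_ratios` helpers, and the 0.5 alert threshold are all hypothetical names and values, not part of any particular agent framework. It also assumes the instrumentation layer can see every attempt, including failed ones, and it counts a self-correction only when the previous call to the same tool failed and the arguments changed.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Any

_MISSING = object()  # sentinel so a missing key differs from a key set to None


@dataclass
class ToolCall:
    """Hypothetical per-attempt record emitted by the agent runtime."""
    tool: str
    args: dict[str, Any]
    ok: bool


def arg_diff(prev: dict[str, Any], curr: dict[str, Any]) -> list[str]:
    """Return the sorted argument names whose values differ between two calls."""
    keys = prev.keys() | curr.keys()
    return sorted(k for k in keys if prev.get(k, _MISSING) != curr.get(k, _MISSING))


def self_correction_ratios(calls: list[ToolCall]) -> dict[str, float]:
    """Per tool: self-corrections divided by total attempts.

    Assumption: a self-correction is a call that directly follows a failed
    call to the same tool and differs from it only in its arguments.
    """
    attempts: Counter[str] = Counter()
    corrections: Counter[str] = Counter()
    prev = None
    for call in calls:
        attempts[call.tool] += 1
        if (
            prev is not None
            and prev.tool == call.tool
            and not prev.ok
            and arg_diff(prev.args, call.args)
        ):
            corrections[call.tool] += 1
        prev = call
    return {tool: corrections[tool] / n for tool, n in attempts.items()}


# Example trace: the task succeeds overall, but bash needed three attempts.
trace = [
    ToolCall("bash", {"cmd": "cat config.yml"}, ok=False),
    ToolCall("bash", {"cmd": "cat config.yaml"}, ok=False),
    ToolCall("bash", {"cmd": "cat ./conf/config.yaml"}, ok=True),
    ToolCall("read_file", {"path": "a.txt"}, ok=True),
]

ALERT_THRESHOLD = 0.5  # hypothetical value; tune against your own baseline
ratios = self_correction_ratios(trace)
flagged = [tool for tool, ratio in ratios.items() if ratio >= ALERT_THRESHOLD]
```

In this trace, two of the three bash attempts are self-corrections, so bash is flagged even though a success-only log would show nothing unusual. Logging the output of `arg_diff` alongside the ratio shows which argument the agent keeps rewriting.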
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:24:18.336825+00:00— report_created — created