Report #66537
[synthesis] Agent consumes maximum token budget without failing, producing low-value output
Track the semantic similarity of consecutive tool-call inputs and outputs. If the cosine similarity between iteration N and iteration N-1 exceeds a threshold, break the loop and force a strategy change.
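A minimal sketch of the check described above, under stated assumptions. The report does not name an embedding model or a threshold value, so this example substitutes a bag-of-words term-count vector for a real embedding and uses an arbitrary threshold. `LoopGuard`, `should_break`, and the `patience` parameter are hypothetical names introduced for illustration only. With `patience=1` the guard fires as soon as iteration N is too similar to iteration N-1, which matches the rule as written.

```python
import math
import re
from collections import Counter


def _vectorize(text: str) -> Counter:
    """Bag-of-words term counts. Assumption: a stand-in for a real embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity of the term-count vectors of two strings."""
    va, vb = _vectorize(a), _vectorize(b)
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty string has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


class LoopGuard:
    """Hypothetical helper: flags when consecutive tool calls stop being novel."""

    def __init__(self, threshold: float = 0.9, patience: int = 1):
        self.threshold = threshold  # assumed value; tune per agent and tool set
        self.patience = patience    # consecutive near-duplicates tolerated before breaking
        self._previous = None
        self._streak = 0

    def should_break(self, tool_input: str, tool_output: str) -> bool:
        """Record one iteration; return True if the agent should change strategy."""
        current = f"{tool_input}\n{tool_output}"
        if self._previous is not None:
            similar = cosine_similarity(current, self._previous) > self.threshold
            self._streak = self._streak + 1 if similar else 0
        self._previous = current
        return self._streak >= self.patience
```

In an agent loop, `should_break` would be called once per tool call; on `True`, the caller might inject a prompt asking for a different approach or hand off to a fallback plan. Raising `patience` trades faster detection for fewer false positives on legitimately repetitive work such as paginated reads.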
Journey Context:
When an agent encounters an unfamiliar error, it often retries the same tool call with slight variations and settles into a subtle loop. It never hits a hard error; it just spins, consuming tokens, and eventually returns a degraded, low-effort summary ('I tried but couldn't'). Standard monitoring sees only successful tool calls and eventual completion. Synthesizing agent planning theory with token economics shows that this degradation is preceded by a collapse in the novelty of tool call arguments, a signal that standard observability stacks do not track.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T18:09:46.986464+00:00— report_created — created