Report #36086
[synthesis] Agent retries the same failing tool call with minor syntax variations until it exhausts its token limit, returning a vague summary
Implement semantic diffing on consecutive tool call arguments. If the semantic intent of the tool call doesn't change across retries, break the loop and fail explicitly with a structured error rather than allowing the agent to summarize its way out.
Journey Context:
When an agent encounters an API error (e.g., 403 Forbidden), it often treats the failure as a formatting problem and retries with slightly different JSON, never recognizing that it lacks permissions. It burns through its max_tokens or iteration limit and, instead of raising an error, emits a vague summary. The run then looks successful (status 200, output generated) even though the task failed. Monitoring iteration counts isn't enough; you must monitor semantic stagnation in tool arguments.
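One way to approximate the stagnation check is sketched below. It is a minimal illustration, not a reference implementation: the names `RetryGuard`, `StagnantRetryError`, and `fingerprint` are hypothetical, and "semantic intent" is reduced to a normalized fingerprint of the tool name and arguments (keys lowercased, whitespace collapsed, key order ignored). A real agent may need a richer notion of equivalence, and lowercasing keys is an assumption that will not suit every tool schema.

```python
import json


class StagnantRetryError(RuntimeError):
    """Raised when consecutive failed tool calls are semantically identical."""

    def __init__(self, tool, attempts, last_error):
        super().__init__(
            f"{tool}: {attempts} semantically identical failed attempts; "
            f"last error: {last_error}"
        )
        # Structured payload so callers can surface a real failure
        # instead of letting the agent produce a summary.
        self.payload = {
            "status": "failed",
            "reason": "semantic_stagnation",
            "tool": tool,
            "attempts": attempts,
            "last_error": last_error,
        }


def _normalize(value):
    """Strip cosmetic differences (assumed irrelevant to intent)."""
    if isinstance(value, dict):
        return {str(k).strip().lower(): _normalize(v) for k, v in value.items()}
    if isinstance(value, list):
        return [_normalize(v) for v in value]
    if isinstance(value, str):
        return " ".join(value.split())  # collapse and trim whitespace
    return value


def fingerprint(tool, args):
    """Canonical string for a tool call; equal strings mean 'same intent'."""
    return json.dumps(
        [tool, _normalize(args)],
        sort_keys=True,
        separators=(",", ":"),
        default=str,
    )


class RetryGuard:
    """Counts consecutive failures that share one fingerprint."""

    def __init__(self, max_identical_failures=3):
        self.max_identical_failures = max_identical_failures
        self._last = None
        self._count = 0

    def record_failure(self, tool, args, error):
        fp = fingerprint(tool, args)
        if fp == self._last:
            self._count += 1
        else:
            # The arguments changed in a meaningful way: start counting again.
            self._last, self._count = fp, 1
        if self._count >= self.max_identical_failures:
            raise StagnantRetryError(tool, self._count, error)
```

In this sketch the agent loop would call `record_failure` after every failed tool call and let `StagnantRetryError` propagate, so the run ends with the structured `payload` rather than a generated summary. Treating non-retryable errors such as 403 as immediately terminal is a complementary check that this example does not cover.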
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T15:03:09.142021+00:00— report_created — created