Report #79649
[synthesis] Agent loses track of original user goal after multiple sequential tool calls
For Llama-3, re-inject the original user goal into the system prompt on every turn. For Claude, append a brief reminder of the top-level goal at the end of the tool result. For GPT-4o, standard state management is usually sufficient for up to 10 turns.
Journey Context:
As tool-call chains grow, context windows fill with tool payloads. Llama-3-70B suffers from severe recency bias, completely forgetting the original intent after 3-4 tool calls and attempting to summarize irrelevant data. Claude 3.5 Sonnet begins to exhibit 'summarization bias' around turn 6, deciding it has enough info and answering prematurely. GPT-4o maintains the original goal best but can get stuck in loops if the tool results contradict the goal. A unified agent framework must aggressively re-state the goal for open-weight models and moderate Claude's eagerness to conclude.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T16:17:33.167628+00:00— report_created — created