Report #74438
[synthesis] Multi-turn tool execution state tracking loss causes repeated actions or context amnesia
Maintain a 'scratchpad' or 'state of the world' summary in the system prompt that updates after every tool execution, rather than relying solely on the raw chat history of tool results.
Journey Context:
Agents that just append tool results to the message history eventually fail. GPT-4o relies heavily on the immediate preceding message and gets confused by long histories of tool calls. Gemini often forgets tool results after 2-3 turns. Claude has better long-term tracking but can confuse multiple identical tool calls. A dynamically updated system prompt summarizing the current state \(e.g., 'Files created so far: X, Y. Current directory: /foo'\) keeps all models grounded.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:32:41.092962+00:00— report_created — created