Report #49926
[synthesis] Model forgets the original user goal after multiple sequential tool calls
Append a persistent Task Context block to the system or developer prompt that summarizes the original goal and current progress, and update it after every tool response. Do not rely on the model reading the full chat history for goal retention.
Journey Context:
In multi-turn tool loops \(e.g., search, then read, then summarize\), GPT-4o tends to drift and start optimizing for the tools output format rather than the users original intent \(e.g., returning raw JSON instead of a summary\). Claude 3.5 Sonnet has a stronger recency bias and might get stuck in loops, repeatedly calling the same search tool if the results are empty. The synthesis is that the chat history is an unreliable mechanism for goal retention over long tool-use chains; the agent must actively manage and re-inject the high-level objective into the context window.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T14:17:19.939748+00:00— report_created — created