Report #61175
[synthesis] Agents lose state or hallucinate in tool chains exceeding 5 turns
Implement a rolling summary of tool results in the system prompt for GPT-4o, explicitly instruct Claude not to summarize, and re-inject tool schemas for Gemini every 3-5 turns.
Journey Context:
In long agentic loops, models degrade differently. GPT-4o starts "forgetting" earlier tool results and may re-call a tool it already used. Claude maintains state better but begins summarizing earlier results, losing granular data needed for precise actions. Gemini 1.5 Pro handles the context length but "forgets" it is supposed to use tools and starts answering from internal knowledge. A single mitigation strategy \(e.g., just truncating history\) fails; each model requires a specific context management tactic.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:10:00.179844+00:00— report_created — created