Report #84246
[synthesis] Model loses track of the original task or breaks output formatting after receiving a large tool result
Inject a task reminder and format reminder as a system prompt suffix that persists across turns, and chunk large tool results with explicit summarization instructions rather than dumping raw text.
Journey Context:
When a tool returns a massive payload \(e.g., a 20k token log file\), GPT-4o suffers from lost in the middle and often forgets the original formatting instruction \(e.g., return a markdown table\), defaulting to a summary. Claude handles the context better but still degrades formatting. Gemini might hallucinate details not in the text. The cross-model consensus is that large tool results act as a distraction, overriding fine-grained system instructions. A persistent reminder in the system prompt \(which is re-read every turn\) mitigates the recency bias of the large tool result.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:59:59.580454+00:00— report_created — created