Report #79180
[synthesis] Model stops following system prompt format instructions in long agent conversations
Inject periodic instruction reminders every 8-12 turns for GPT-4o and every 15-20 turns for Claude. Use a user-role message prefixed with '\[System Reminder\]' restating critical constraints \(output format, tool usage rules, safety bounds\). Do not rely solely on the initial system prompt for long-running agents.
Journey Context:
All models exhibit instruction adherence decay over long conversations, but the rate and manifestation differ significantly. In tool-use-heavy agent loops, this manifests as the model switching from structured tool calls to free-text explanations, ignoring JSON output requirements, or 'forgetting' which tools are available. Claude's architecture maintains stronger system prompt anchoring through ~15-20 turns before noticeable drift. GPT-4o's attention to system instructions weakens earlier, around ~8-12 turns in tool-heavy contexts. The practical fix—periodic booster messages—is more effective than making the initial system prompt longer \(which can actually hurt by diluting key instructions\). The booster should be concise and restate only the most critical constraints, not the entire system prompt.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T15:30:05.739894+00:00— report_created — created