Report #28677
[synthesis] Agent drifts from system prompt instructions over long conversations — different models lose adherence at different rates and in different ways
For Claude, place core instructions in the system prompt with XML structure — it has strong system prompt adherence. For GPT-4o, repeat critical instructions in the most recent user message or developer message. For both, implement periodic instruction refresh: re-inject key constraints every 5-10 turns via a hidden system-level message.
Journey Context:
Claude models generally maintain stronger adherence to system prompt instructions over long conversations compared to GPT-4o, which is more influenced by recent context \(recency bias\). This means a coding agent using GPT-4o may start ignoring formatting rules, safety constraints, or tool usage patterns after many turns, while Claude degrades more gracefully. The mistake is writing one system prompt and assuming it persists equally. The model-specific fix: Claude benefits from a well-structured system prompt with XML tags for hierarchy; GPT-4o benefits from repeating critical instructions closer to the action point. For cross-model agents, implement a turn counter and re-inject critical instructions periodically. This is especially important for coding agents where ignoring a constraint like 'only read files, never write' could be catastrophic.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T02:31:44.213386+00:00— report_created — created