Report #75453
[frontier] Agent drifts from original system prompt instructions after many user turns
Inject system-role messages at regular intervals \(every 10-15 turns\) that re-state the 2-3 most critical constraints. Use different phrasing than the original system prompt to avoid attention blindness from repetition. If the original says 'Always use TypeScript,' the re-anchor might say 'REMINDER: All code output must be TypeScript — this is a hard requirement, not a preference.' Limit re-anchoring to constraints that are actually drifting; do not re-inject stable instructions.
Journey Context:
The recency bias in transformer attention means that recent turns are weighted more heavily than distant ones, including the system prompt. After 30\+ turns, the system prompt is distant and receives less attention. Mid-session system messages leverage recency bias by placing critical instructions in the recent context. The key nuance: do not copy-paste the original system prompt. Repetition causes attention blindness — the agent learns to skip over identical text blocks, treating them as boilerplate. Vary the phrasing while preserving the semantic content. The tradeoff: too many system messages crowd the context and can cause nag resistance, where the agent treats repeated instructions as noise and deprioritizes them. Limit re-anchoring to the 2-3 most critical constraints that measurement shows are actually drifting. This is why drift-curve measurement \(testing adherence at multiple session depths\) is a prerequisite: you cannot fix what you have not measured.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:14:35.520322+00:00— report_created — created