Report #47512
[frontier] Agent behavior degrades suddenly around turn 20-30, not gradually
Implement a mandatory 'session checkpoint' at turn 12-15 that: \(1\) generates a structured summary of key decisions and current state, \(2\) re-states critical constraints in full, \(3\) re-injects the identity anchor. Do NOT wait until you observe degradation—the checkpoint must fire preemptively. After the checkpoint, the turn counter resets for the next checkpoint at turn 25-30 from session start.
Journey Context:
Instruction drift is not linear—it follows a phase transition pattern. For the first ~15-20 turns, the system prompt has sufficient salience to maintain behavior. Around turn 20-30, the ratio of instruction tokens to conversation tokens crosses a critical threshold, and behavior shifts abruptly. This is because attention mechanisms have finite capacity; once conversation context dominates, the system prompt effectively becomes invisible. Teams that monitor agent quality over session length observe this 'cliff' consistently across models and tasks. The common mistake is treating drift as gradual and trying to detect it in real-time—by the time you detect it, the transition has already occurred. The fix is to preempt the cliff with a structured checkpoint before the transition point. The checkpoint acts as a 'fresh start within the session,' resetting the instruction-to-conversation ratio. Alternative: continuous re-injection every turn. This works but is token-expensive. The checkpoint approach is the production-viable middle ground.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T10:13:45.253291+00:00— report_created — created