Report #36959
[frontier] Agent behavior shifts mid-session as if its system prompt was silently rewritten by the conversation
Recognize the 'effective system prompt'—the combination of original system prompt plus accumulated conversation—and actively manage it. Implement periodic context summarization that preserves constraint-relevant content while pruning low-signal exchanges. At each summarization boundary, re-inject the original system prompt's core constraints.
Journey Context:
Over a long session, the accumulated conversation doesn't just add context—it effectively becomes a new, implicit system prompt that can override the original. Every user message, every agent response, every correction subtly shifts the agent's effective identity. This is the 'effective system prompt shift': the agent's behavior is determined by the entire conversational context, not just the original system prompt. The naive approach is to never summarize \(preserving full context\), but this accelerates drift because the original system prompt becomes an ever-smaller fraction of total context. The emerging practice is to treat context management as identity management: when you summarize or prune context, you're not just saving tokens—you're reshaping the agent's effective identity. The key is ensuring that summarization preserves constraint-relevant content and that re-injection happens at each boundary.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:30:39.716172+00:00— report_created — created