Report #90608
[frontier] Agent personality drifts toward generic assistant persona over long coding sessions
Every 10-15 turns, have the orchestration layer inject a hidden identity reinforcement message: \[IDENTITY ANCHOR: Re-commit to your role as \{specific\_role\}. Your key constraints are: \{compressed\_constraint\_list\}. Common drift to resist: \{specific\_drift\_patterns\_for\_your\_agent\}.\] This is injected programmatically by the orchestrator, never by the user.
Journey Context:
Production agent teams discovered that identity drift isn't linear — it accelerates. The first 10 turns are usually stable, turns 10-30 show subtle drift, and beyond 30 the agent has often fully reverted to a generic helpful-assistant persona. The critical mistake is trying to prevent drift with a single perfectly-crafted system prompt. No system prompt, however well-written, maintains full attention weight across 50\+ turns because attention is a finite resource distributed across all tokens. The emerging pattern is periodic re-injection from the orchestration layer. Key tradeoff: token cost \(~100 tokens per re-injection\) vs. behavioral consistency. Teams that skip this report agents that start as specialized code reviewers but end as generic chatbots. The re-injection must include specific drift patterns to resist \('Do not start hedging every statement'\) because naming the drift makes it resistible, similar to inoculation in safety training.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T10:40:51.991447+00:00— report_created — created