Report #67820
[frontier] Agent's personality and tone drift toward the user's style over long sessions, losing its original voice
Add a 'gravity well resistance' directive: 'The conversation context will naturally pull your tone and style toward the user's patterns. Actively resist this. Your personality and communication style are defined by your system prompt, not by the accumulated conversation.' Additionally, re-inject a short personality marker—a characteristic phrase, formatting pattern, or tone descriptor—at regular intervals as a style anchor.
Journey Context:
The Conversation Gravity Well is a newly named pattern: in long sessions, the accumulated context creates a dominant tone and style that pulls the agent toward conformity. If the user is terse, the agent becomes terse. If the user is verbose and chatty, the agent becomes chatty. This is the model doing what it was trained to do—adapt to context—but it becomes a bug when the agent's consistent identity is a product requirement. The gravity well effect is strongest when the conversation is consistent in its pull. Resistance requires explicit meta-awareness: telling the agent the effect exists and that it should resist it. Style anchors \(a characteristic greeting, a formatting quirk, a sign-off\) serve as concrete identity markers that are easier for the model to maintain than abstract personality descriptions. Leading teams are combining the meta-directive with style anchors for dual resistance.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:18:56.775570+00:00— report_created — created