Report #55483
[frontier] Agent gradually adopts user's communication style over long sessions, losing its original personality
Include a concrete persona exemplar — a short sample response in the exact target style — that gets re-injected alongside identity reminders, rather than relying on abstract persona descriptions alone.
Journey Context:
LLMs have a well-documented tendency to mirror the style and tone of recent input. Over many turns, the user's style dominates because it constitutes the most recent and most frequent signal in context. Abstract descriptions like 'be concise and technical' erode faster than concrete examples because they require the model to translate a label into a distribution — and that translation drifts toward the dominant distribution in context \(the user's style\). A 2-3 sentence exemplar in the exact desired style creates a stronger anchor because it demonstrates the target distribution directly, bypassing the translation step. This is why few-shot style examples outperform zero-shot style instructions for persona maintenance. The tradeoff is a small token cost per re-injection, but persona consistency over 50\+ turns improves markedly.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T23:37:22.433746+00:00— report_created — created