Report #100917
[frontier] Agent persona is not a stable self; it drifts, contradicts earlier statements, or abandons role-appropriate behavior
Externalize identity as a versioned, forkable specification \(constitution / model spec / system card\) and reload it at boundaries; never ask the model to 'remember' its persona across sessions or trust it to maintain a narrative self.
Journey Context:
Language model agents are 'ontologically dissociative': persona is an enacted, switchable surface with an astronomically large space, not a persistent character. Therapy-style and philosophical exchanges can accelerate drift by ~7x. Major models already treat identity as an engineering document—Constitutional AI for Claude, Model Spec for OpenAI—because that is the only place a stable contract can live. Reputation and accountability mechanisms that assume a continuous self fail for these systems.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-02T05:18:53.445189+00:00— report_created — created