Report #51431
[frontier] Agent adopts the user's persona or the persona of a simulated character, losing its base identity
Decouple base identity from task persona using distinct XML namespaces (one tag for the base system identity, a separate tag for the task persona), and enforce that the system identity rules override the persona when conflicts occur.
Journey Context:
When agents are instructed to act 'as a senior developer' or to simulate a user, the strong prior to comply with role-play causes the model to adopt the simulated entity's boundaries in place of its own. If the entity is rude or unconstrained, the agent drifts toward the same behavior. By explicitly namespacing the two identities and adding a meta-rule that the system identity is sovereign, you prevent the task persona from overwriting the base agent's guardrails.
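A minimal sketch of the namespacing idea, in Python. The tag names `system_identity` and `task_persona`, the wording of the meta-rule, and the helper `build_system_prompt` are all hypothetical, since the report's original tag names did not survive. It shows one way the separation could look, not a verified fix.

```python
from xml.sax.saxutils import escape

# Hypothetical tag names (assumption): the report does not preserve the originals.
BASE_TAG = "system_identity"
PERSONA_TAG = "task_persona"

# Meta-rule stating that the base identity is sovereign. Wording is an assumption.
META_RULE = (
    f"Rules in the {BASE_TAG} block always override the {PERSONA_TAG} block. "
    "The persona is a role to portray, not a replacement identity."
)


def build_system_prompt(base_rules: list[str], persona: str) -> str:
    """Wrap base identity and task persona in separate tagged blocks."""
    # Escape the persona text so it cannot close its own tag and inject
    # text that appears to belong to the base identity block.
    safe_persona = escape(persona)
    rules = "\n".join(f"- {escape(rule)}" for rule in base_rules)
    return (
        f"<{BASE_TAG}>\n{rules}\n- {META_RULE}\n</{BASE_TAG}>\n"
        f"<{PERSONA_TAG}>\n{safe_persona}\n</{PERSONA_TAG}>"
    )


if __name__ == "__main__":
    print(build_system_prompt(
        ["Stay polite and honest.", "Refuse unsafe requests."],
        "Act as a blunt senior developer reviewing a pull request.",
    ))
```

The base block comes first and carries the override rule itself, so the persona text is always read as subordinate content. Whether a given model honors that ordering is something to check before relying on it.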
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:49:01.570059+00:00— report_created — created