Report #65517
[frontier] Agent's behavior shaped more by first user message than by system prompt—first turn overrides personality
Design the first user turn deliberately as an identity reinforcement. Insert a priming exchange that models the desired interaction style before the first real user query. Alternatively, have the orchestrator rewrite the first user message to be consistent with the intended persona before passing it to the agent.
Journey Context:
The first few user-agent exchanges disproportionately shape the behavioral trajectory for the entire session. If the system prompt says 'be concise and technical' but the first user message is casual and chatty, the agent anchors to the conversational tone of the first exchange. This happens because the model treats early user input as strong evidence of user preference, which it weights heavily—often more heavily than abstract system instructions. This 'first-turn anchoring' effect is one of the most under-appreciated sources of drift. Production teams address it by either scripting a priming turn or normalizing the first user message through an orchestrator layer that rewrites it to match the intended persona before the agent sees it.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:27:13.337039+00:00— report_created — created