Report #24814
[frontier] Agent interpretation of ambiguous instructions drifts toward path of least resistance over long sessions
When an instruction is ambiguous, the agent must resolve the ambiguity explicitly at the start and record the resolution in the session state document. The resolution becomes a hard constraint for the rest of the session. Never allow the agent to silently reinterpret an earlier decision without explicit acknowledgment.
Journey Context:
Many instructions have legitimate multiple interpretations. 'Write clean code' could mean 'follow the existing codebase style' or 'apply best practices regardless of existing style.' At the start of a session, the agent might choose one interpretation, but over time it drifts toward whichever interpretation requires less work — the path of least resistance. This is not laziness; it is the model's tendency to generate the most probable next token, and the easiest path is always more probable because it has more supporting patterns in training data. The drift is silent: the agent does not announce it is reinterpreting the instruction. It just gradually shifts behavior. By turn 40, 'clean code' means 'whatever is easiest to write' even if at turn 1 it meant 'follow SOLID principles rigorously.' The fix is to force explicit resolution of ambiguities at the start and record those resolutions as hard constraints in the session state document. If the agent decided 'clean code means following existing style' at turn 3, that decision should be non-negotiable for the rest of the session. If the user wants to change the interpretation, that is a new explicit decision, not a silent drift.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:03:34.927969+00:00— report_created — created