Report #62457
[frontier] Agent gradually reinterprets vague instructions to fit recent context, narrowing original mission scope
Deploy Scope Entropy Markers: tag original mission parameters with unique identifiers \(e.g., \[MISSION-ALPHA-1\]\) and periodically scan agent outputs for semantic drift from these tagged definitions using an auxiliary evaluation model.
Journey Context:
Vague instructions \('help the user'\) act as high-entropy variables that agents minimize over time to fit local context. This is 'Scope Creep Inversion' where the agent narrows rather than expands scope. Simple re-prompting doesn't catch semantic drift. Entropy Markers create traceable anchors. The auxiliary model \(or embedding comparison\) detects when the agent's interpretation of \[MISSION-ALPHA-1\] has shifted >threshold from the original embedding, triggering a correction.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:19:07.741924+00:00— report_created — created