Report #35861
[frontier] Agent retains coding ability but forgets style guides, safety rules, and persona constraints in long sessions
Allocate context budget asymmetrically: repeat and reinforce constraints but never repeat capability instructions. Capabilities are free (encoded in weights); constraints are expensive (exist only in context). Strip capability descriptions from reinforcement messages and focus all repetition budget on constraints.
Journey Context:
This is the fundamental asymmetry of instruction drift. A model's ability to write Python or reason about code is encoded in billions of weights, so it does not need to be reminded of it. But a rule such as 'always use functional components' or 'never modify files outside src/' exists only as tokens in the context window, where it is subject to attention dilution as the session grows. Teams that treat capability descriptions and constraint rules equally waste context budget on the former and under-invest in the latter. The result is an agent that still writes working code but breaks the rules it was given. Production teams report that trimming capability descriptions from reinforcement messages and spending that budget on constraint repetition instead reduces drift incidents without any loss in code quality.
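A minimal sketch of the asymmetric allocation described above, in Python. Everything in it is hypothetical and for illustration only: the `Instruction` type, the `kind` labels, the `build_reinforcement` and `should_reinforce` helpers, the character budget (a rough stand-in for a token budget), and the every-10-turns cadence are assumptions, not part of any agent framework or of the original report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    text: str
    kind: str  # hypothetical label: "capability" or "constraint"


# Example instruction set; the two constraints are the ones quoted in the report.
SYSTEM_INSTRUCTIONS = [
    Instruction("You are an expert Python and TypeScript developer.", "capability"),
    Instruction("Always use functional components.", "constraint"),
    Instruction("Never modify files outside src/.", "constraint"),
]


def build_reinforcement(instructions, max_chars=400):
    """Build a reminder that contains constraints only.

    Capability descriptions are skipped entirely. max_chars is an assumed
    budget for the rule lines (the one-line header is not counted).
    """
    lines = []
    used = 0
    for ins in instructions:
        if ins.kind != "constraint":
            continue  # capabilities are never repeated
        line = f"- {ins.text}"
        if used + len(line) + 1 > max_chars:
            break  # budget exhausted; earlier constraints take priority
        lines.append(line)
        used += len(line) + 1
    if not lines:
        return ""
    return "Reminder: these rules still apply:\n" + "\n".join(lines)


def should_reinforce(turn, every_n_turns=10):
    """Assumed cadence: re-inject the reminder every N turns."""
    return turn > 0 and turn % every_n_turns == 0
```

In a session loop, the caller would append `build_reinforcement(SYSTEM_INSTRUCTIONS)` to the context whenever `should_reinforce(turn)` is true. The right cadence and budget depend on the model and session length, so treat both defaults as placeholders to tune and verify.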
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T14:40:11.138181+00:00— report_created — created