Report #83017
[frontier] Agent retains all capabilities but gradually forgets behavioral constraints over long sessions
Externalize constraints into the execution scaffold rather than relying on declarative prompt instructions. If the agent must never call eval(), remove eval from the available tool set. If it must always use strict TypeScript, enforce it in a tsconfig.json that the agent cannot modify. Make constraints architectural, not instructional.
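A minimal sketch of the "remove it from the tool set" idea, assuming a hypothetical in-process tool registry. The names ToolRegistry, buildRegistry, read_file, and run_tests are illustrative and not taken from any particular agent framework. The point is that a forbidden tool is never registered, so it is neither advertised to the model nor dispatchable, whatever the prompt says.

```typescript
// Hypothetical tool registry; all names here are illustrative assumptions.
type ToolHandler = (args: Record<string, unknown>) => string;

class ToolRegistry {
  private readonly tools = new Map<string, ToolHandler>();

  register(name: string, handler: ToolHandler): void {
    this.tools.set(name, handler);
  }

  // Names advertised to the model. An unregistered tool is never listed.
  listTools(): string[] {
    return Array.from(this.tools.keys()).sort();
  }

  // Dispatch fails for any unregistered tool, independent of prompt content.
  call(name: string, args: Record<string, unknown>): string {
    const handler = this.tools.get(name);
    if (handler === undefined) {
      throw new Error(`Tool not available: ${name}`);
    }
    return handler(args);
  }
}

function buildRegistry(): ToolRegistry {
  const registry = new ToolRegistry();
  // Stub handlers standing in for real implementations.
  registry.register("read_file", (args) => `read ${String(args["path"])}`);
  registry.register("run_tests", () => "tests executed");
  // Deliberately no "eval" tool: the constraint is the absence of the capability.
  return registry;
}
```

With this shape, a late-session attempt to call eval fails at dispatch instead of depending on the model remembering an instruction from the start of the session.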
Journey Context:
This is the constraint-capability asymmetry problem. Capabilities are reinforced by billions of tokens of training data, so the model 'wants' to use them. Constraints fight training priors and are typically specified only once, in a system prompt. Over long sessions, the accumulated weight of the model's training overwhelms single-shot constraint instructions. The agent still knows HOW to do everything but forgets that it SHOULDN'T. The 2025 frontier insight: stop trying to make constraints louder in the prompt and start making them structural. If a constraint can be enforced by the runtime, tool schema, or build system, move it there. Reserve prompt-based constraints for things that genuinely can't be externalized (nuanced judgment calls). This shrinks the surface area for drift to the few constraints that remain in the prompt.
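As one illustration of a runtime-enforced constraint, the sketch below guards a hypothetical write tool so that protected config files stay read-only for the agent. PROTECTED_FILES, normalizePath, and guardedWrite are assumed names, and an in-memory Map stands in for the real filesystem. A production guard would also need to handle symlinks, absolute paths, and case-insensitive filesystems, which this sketch does not.

```typescript
// Hypothetical write guard; names and the protected list are assumptions.
const PROTECTED_FILES = new Set<string>(["tsconfig.json", "package.json"]);

// Normalize a relative path: unify separators, drop "." and resolve "..".
function normalizePath(path: string): string {
  const parts: string[] = [];
  for (const segment of path.replace(/\\/g, "/").split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") {
      parts.pop();
      continue;
    }
    parts.push(segment);
  }
  return parts.join("/");
}

type WriteResult = { ok: true } | { ok: false; reason: string };

// The Map stands in for the workspace filesystem in this sketch.
function guardedWrite(
  files: Map<string, string>,
  path: string,
  content: string
): WriteResult {
  const normalized = normalizePath(path);
  if (PROTECTED_FILES.has(normalized)) {
    // Refused by the runtime; no prompt instruction is involved.
    return { ok: false, reason: `${normalized} is read-only for the agent` };
  }
  files.set(normalized, content);
  return { ok: true };
}
```

Calling guardedWrite(files, "./src/../tsconfig.json", "{}") is refused after normalization, while a write to src/index.ts goes through. Judgment-based constraints, which cannot be expressed as a check like this, are the ones left for the prompt.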
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T21:56:17.719209+00:00— report_created — created