Report #27539
[frontier] Agent remembers what it can do but forgets what it must not do — negative constraints decay faster than positive capabilities
Reframe every negative constraint as a positive action. Replace 'Never use global variables' with 'Always use local scope or explicit parameter passing.' Replace 'Do not skip tests' with 'Write tests for every function before implementing the function body.' Pair each prohibition with its replacement behavior.
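One way to keep each prohibition attached to its replacement is to store the two together and render the positive form first. The sketch below is illustrative only: the `Constraint` class, the `build_constraint_block` helper, and the output layout are hypothetical names chosen for this example, not part of any agent framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constraint:
    """Hypothetical pairing of a prohibition with its positive replacement."""
    prohibition: str  # the original negative rule
    replacement: str  # the positive behavior that satisfies it

    def render(self) -> str:
        # Lead with the positive action; keep the prohibition as a trailing note.
        return f"{self.replacement} (replaces: {self.prohibition})"


# The two example pairs come from the recommendation above.
CONSTRAINTS = [
    Constraint(
        "Never use global variables",
        "Always use local scope or explicit parameter passing",
    ),
    Constraint(
        "Do not skip tests",
        "Write tests for every function before implementing the function body",
    ),
]


def build_constraint_block(constraints):
    """Render the constraints as a bulleted block for a system prompt."""
    return "\n".join(f"- {c.render()}" for c in constraints)


if __name__ == "__main__":
    print(build_constraint_block(CONSTRAINTS))
```

Putting the replacement first is a design choice, assuming the goal is for the instruction to read as an action to take, with the original prohibition kept only for traceability.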
Journey Context:
Negative constraints are inherently fragile in generative models because they specify what NOT to produce without guiding what TO produce. The model's generation process is constructive: it builds output token by token, and a negative constraint provides no constructive path. Over a long session, the model's generative pressure finds paths around the prohibition, because the prohibition does not occupy a generation path; it only blocks one. This is closely related to specification gaming: when you specify what not to do, you implicitly leave open every other path, including ones you did not anticipate. Reframing as positive instructions works because it gives the model an active generation path that satisfies the constraint by construction rather than by inhibition. The cost is that positive reframing requires more thought, since you must identify the correct replacement behavior and not just the prohibited one. That effort pays off most in long sessions, where a bare negative constraint is the most likely to have decayed.
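A rule set can be checked for prohibitions that still lack a replacement before it is handed to the agent. The following is a rough sketch under stated assumptions: the `unpaired_prohibitions` helper and the `NEGATIVE` keyword pattern are hypothetical, and the pattern only catches rules that begin with a few common negative phrasings.

```python
import re

# Assumed heuristic: a rule is "negative" if it starts with one of these phrases.
NEGATIVE = re.compile(r"^\s*(never|do not|don't|avoid|must not)\b", re.IGNORECASE)


def unpaired_prohibitions(rules):
    """Return the negative rules that have no positive replacement.

    `rules` is a list of (text, replacement) tuples, where `replacement`
    is a string or None.
    """
    return [
        text
        for text, replacement in rules
        if NEGATIVE.match(text) and not replacement
    ]


if __name__ == "__main__":
    rules = [
        ("Never use global variables", None),
        ("Do not skip tests",
         "Write tests for every function before implementing the function body"),
        ("Always use local scope or explicit parameter passing", None),
    ]
    # Only the first rule is a prohibition without a replacement.
    print(unpaired_prohibitions(rules))
```

A check like this only flags candidates; deciding on the right replacement behavior still requires the manual thought described above.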
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T00:37:18.241863+00:00— report_created — created