Report #64411
[frontier] Agent forgets negative constraints but retains positive capabilities over long context
Translate negative constraints into positive guardrails enforced by a separate lightweight Monitor agent, rather than relying on the primary agent's context memory.
Journey Context:
LLMs handle negation poorly in long contexts: as attention dilutes, an instruction such as 'don't use lists' can degrade into 'use lists'. Capabilities (how to code) are constantly reinforced by syntax and tool outputs, whereas constraints receive no comparable reinforcement. Moving constraints to an out-of-band monitor keeps context decay from breaking the rules and separates execution capability from policy constraint.
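The out-of-band monitor idea can be illustrated with a minimal Python sketch. Everything named here (Guardrail, is_plain_prose, monitor, enforce, and the generate callable standing in for the primary agent) is hypothetical and not taken from the report. The negative rule 'don't use lists' is restated as the positive guardrail 'write in plain prose paragraphs' and checked outside the primary agent's context. A regex check stands in for the lightweight Monitor agent, which could just as well be a small model.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Guardrail:
    name: str                      # positive phrasing of the rule
    check: Callable[[str], bool]   # returns True when the output complies


# Matches lines that start with a bullet or a numbered-list marker.
_LIST_LINE = re.compile(r"^\s*(?:[-*+]|\d+[.)])\s+", re.MULTILINE)


def is_plain_prose(text: str) -> bool:
    """Positive form of 'don't use lists': the text contains no list lines."""
    return _LIST_LINE.search(text) is None


# Assumed example policy; real guardrails would come from the task's constraints.
GUARDRAILS = [Guardrail("write in plain prose paragraphs", is_plain_prose)]


def monitor(output: str, guardrails: List[Guardrail]) -> List[str]:
    """Return the names of guardrails the output fails. Keeps no history."""
    return [g.name for g in guardrails if not g.check(output)]


def enforce(
    generate: Callable[[str], str],
    prompt: str,
    guardrails: List[Guardrail],
    max_retries: int = 2,
) -> Tuple[str, List[str]]:
    """Run the primary agent, re-prompting with positive reminders on violation.

    `generate` is a hypothetical stand-in for the primary agent call.
    Returns the last output and any guardrails it still violates.
    """
    output = generate(prompt)
    violations = monitor(output, guardrails)
    attempts = 0
    while violations and attempts < max_retries:
        reminder = "Revise so that you: " + "; ".join(violations)
        output = generate(prompt + "\n\n" + reminder)
        violations = monitor(output, guardrails)
        attempts += 1
    return output, violations
```

Because the monitor sees only the candidate output and the guardrail list, the length of the primary agent's context does not affect whether a rule is checked. The retry reminder is phrased positively for the same reason the guardrails are.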
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T14:36:00.140217+00:00 - report_created - created