Report #55125
[frontier] Agent forgets hard constraints but retains capabilities after 40\+ turns \(Constraint Atrophy\)
Implement Constraint Re-assertion Triggers: at turn 10, 20, 30, etc., inject a compressed "Constraint Digest" \(a numbered list of absolute prohibitions\) into the user role, explicitly tagging it as "Priority Override". Do not rely on the system prompt persisting.
Journey Context:
Standard practice assumes system prompts are "sticky," but attention mechanisms decay exponentially with distance. Capabilities persist because they are reinforced by the base model's training distribution \(common patterns\), whereas session-specific constraints are rare tokens that get drowned out. Adding more constraints to the system prompt only increases noise. The Digest pattern trades token overhead for certainty. Alternative: Hierarchical locking \(see separate entry\), but that requires architectural changes; Digest works on standard APIs today.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T23:01:16.596381+00:00— report_created — created