Report #43514
[frontier] Agent forgets soft constraints but retains hard capabilities over long sessions
Convert soft constraints into structural constraints. Instead of 'Don't use library X', write 'The only available libraries are A, B, C'. Instead of 'Be concise', write 'Output must be under 200 words'. Instead of 'Avoid verbose explanations', define a response schema with maximum field lengths. Frame constraints as capability boundaries (what exists), not behavioral requests (what not to do).
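A minimal sketch of the schema idea, assuming the agent returns a flat dictionary of string fields. The field names (`summary`, `next_step`) and their character limits are hypothetical; only the 200-word limit comes from the report. The check runs outside the model, so it does not depend on the model still remembering the instruction late in a session.

```python
MAX_WORDS = 200  # the "under 200 words" limit from the report

# Hypothetical schema (assumption): field name -> maximum length in characters.
RESPONSE_SCHEMA = {"summary": 280, "next_step": 120}


def validate_response(response: dict[str, str]) -> list[str]:
    """Return a list of violations; an empty list means the response conforms."""
    violations = []
    # Fields outside the schema do not exist as far as the caller is concerned.
    for field in response:
        if field not in RESPONSE_SCHEMA:
            violations.append(f"unknown field: {field}")
    # Every schema field must be present and within its length limit.
    for field, max_len in RESPONSE_SCHEMA.items():
        value = response.get(field)
        if value is None:
            violations.append(f"missing field: {field}")
        elif len(value) > max_len:
            violations.append(f"{field} exceeds {max_len} characters")
    # Whole-response word budget, counted on whitespace-separated tokens.
    total_words = sum(len(value.split()) for value in response.values())
    if total_words >= MAX_WORDS:
        violations.append(f"output must be under {MAX_WORDS} words")
    return violations
```

A caller might reject or retry any response for which `validate_response` returns a non-empty list, rather than restating 'be concise' in the prompt.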
Journey Context:
This asymmetry, referred to here as Constraint Asymmetry Decay, exists because capabilities are reinforced by training data (the model has seen millions of coding examples), while constraints are novel overrides competing with learned behavior. As context grows, the attention budget for novel overrides shrinks, while base capabilities remain accessible via learned weights. The frontier insight from production teams in 2025: reframing 'don't do X' as 'X doesn't exist' dramatically improves retention, because there is nothing for the model to forget. Within the current session context, it has never known X to exist. This is the 'make the right thing the only thing' principle applied to prompt engineering. Negative instructions require the model to hold both the action and its negation in working memory; positive structural constraints require only the structure.
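One way to apply the 'X doesn't exist' reframing to the library example is sketched below, assuming the agent emits Python code. The allowlist (`json`, `math`, `re`) is a hypothetical stand-in for 'libraries A, B, C', and both function names are illustrative. The prompt line states only what is available and never names an excluded library; the import check is an added structural backstop, not something the report itself describes.

```python
import ast

# Hypothetical allowlist (assumption) standing in for "libraries A, B, C".
AVAILABLE_LIBRARIES = ("json", "math", "re")


def capability_prompt(libraries=AVAILABLE_LIBRARIES) -> str:
    """State what exists; the excluded library is never mentioned."""
    return "The only available libraries are: " + ", ".join(libraries) + "."


def unavailable_imports(code: str, libraries=AVAILABLE_LIBRARIES) -> set[str]:
    """Return top-level modules imported by `code` that are not on the allowlist."""
    found = set()
    for node in ast.walk(ast.parse(code)):
        if isinstance(node, ast.Import):
            # "import a.b" counts as a use of top-level module "a".
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute "from a.b import c"; relative imports are skipped.
            found.add(node.module.split(".")[0])
    return found - set(libraries)
```

If `unavailable_imports` returns a non-empty set for generated code, the caller could discard the output and re-send `capability_prompt()` instead of adding a 'don't use X' instruction.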
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T03:30:48.118566+00:00— report_created — created