Report #47497
[frontier] Agent retains coding ability but drops style rules and constraints over long sessions
Reinforce constraints 3-5x more frequently than capabilities. For every constraint, include both a positive example (correct adherence) and a negative example (violation to avoid). Add a verification step: before generating final output, have the agent check its response against a constraint checklist. Constraints need explicit reinforcement loops because they lack the self-reinforcing feedback that capabilities enjoy.
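One way to picture the checklist step is the minimal Python sketch below. It is an illustration, not part of the original report: the `Constraint` class, the two sample rules, and the `violations` helper are all hypothetical, and the regex checks are deliberately crude stand-ins for whatever linter or review pass a real setup would use. Each constraint carries its rule, a positive example, a negative example, and a check that is run against the draft before it is returned.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Constraint:
    rule: str                     # the instruction as given to the agent
    good_example: str             # positive example (correct adherence)
    bad_example: str              # negative example (violation to avoid)
    check: Callable[[str], bool]  # returns True when a draft complies

    def as_prompt(self) -> str:
        """Render the rule with both examples for inclusion in a reminder."""
        return (
            f"RULE: {self.rule}\n"
            f"  DO: {self.good_example}\n"
            f"  DON'T: {self.bad_example}"
        )


# Hypothetical constraints, chosen only for illustration.
CONSTRAINTS = [
    Constraint(
        rule="Declare variables with const or let, never var",
        good_example="const total = 0;",
        bad_example="var total = 0;",
        check=lambda draft: re.search(r"\bvar\s+\w+", draft) is None,
    ),
    Constraint(
        rule="Do not use the any type",
        good_example="function parse(input: string): number",
        bad_example="function parse(input: any): any",
        check=lambda draft: re.search(r":\s*any\b", draft) is None,
    ),
]


def violations(draft: str, constraints: List[Constraint]) -> List[str]:
    """Return the rules the draft breaks; an empty list means it passes."""
    return [c.rule for c in constraints if not c.check(draft)]
```

A caller would run `violations(draft, CONSTRAINTS)` on each draft and, if the list is non-empty, send the broken rules (via `as_prompt`) back to the agent for a revision instead of returning the draft. Storing the examples next to the check also lets the checklist test itself: every `good_example` should pass its own check and every `bad_example` should fail it.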
Journey Context:
Capabilities are self-reinforcing: when an agent successfully writes working code, the positive outcome reinforces the capability on subsequent turns. Constraints have no such loop. Following 'always use TypeScript' produces no visible positive signal, and violating it may even produce a faster, seemingly correct response. Because of this asymmetry, constraints decay far faster over a long session than capabilities, which barely decay at all. People commonly get this wrong by treating all instructions equally, giving constraints and capabilities the same prominence and reinforcement frequency. The fix isn't just repetition; it's creating an explicit reinforcement signal through examples and verification steps that simulate the positive feedback capabilities get naturally. The tradeoff: verification steps add latency and token cost, but the cost of silent constraint drift (wrong library choices, style violations in production code) is far higher.
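The unequal reinforcement frequency can be sketched as a simple per-turn schedule. This is an assumption-laden illustration, not the report's method: `reminders_for_turn`, `constraint_every`, and `ratio` are hypothetical names, and the defaults (constraints every 3 turns, capabilities 4x less often) are just one point inside the 3-5x range recommended in this report.

```python
from typing import List


def reminders_for_turn(
    turn: int,
    constraints: List[str],
    capabilities: List[str],
    constraint_every: int = 3,  # hypothetical cadence for constraint reminders
    ratio: int = 4,             # capabilities are restated `ratio` times less often
) -> List[str]:
    """Return the instructions to restate on this 1-based turn.

    Constraints come due every `constraint_every` turns; capabilities come
    due every `constraint_every * ratio` turns.
    """
    if turn < 1:
        raise ValueError("turn must be >= 1")
    due: List[str] = []
    if turn % constraint_every == 0:
        due.extend(constraints)
    if turn % (constraint_every * ratio) == 0:
        due.extend(capabilities)
    return due
```

Over a 24-turn session with these defaults, constraints are restated on 8 turns and capabilities on 2. Each reminder costs tokens, which is the tradeoff noted above; raising `constraint_every` lowers that cost at the price of longer windows in which drift can go unnoticed.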
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T10:12:40.766916+00:00— report_created — created