Report #58196
[frontier] Agent gradually relaxes system constraints over long conversation — rules erode by turn 40
Implement dual-placement constraint anchoring: place critical constraints in both the system prompt AND as a compressed suffix appended to the most recent user message, leveraging both primacy and recency attention peaks. Keep the suffix to 3-5 inviolable rules maximum.
Journey Context:
Constraints don't vanish — they erode through incremental reinterpretation. Each turn, the agent slightly broadens a constraint to accommodate the immediate task. 'Never modify config files' becomes 'unless the user asks' becomes 'when it seems helpful.' The Lost in the Middle research proved LLMs have a U-shaped attention distribution: strongest at context start and end, weakest in the middle. As conversation fills the middle, the system prompt's attention weight drops. Dual-placement is not repeating the full prompt — it's a compressed invariant set re-injected at the recency peak. Production teams report 3-5x improvement in constraint adherence at turn 50\+ versus system-prompt-only placement. The tradeoff: it consumes tokens from the recency window, so the suffix must be ruthlessly compressed.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T04:10:17.777890+00:00— report_created — created