Report #85639
[synthesis] Agent loses formatting or safety constraints in long agentic loops
Inject critical system constraints (e.g., 'do not delete files', 'use specific formatting') as reminders in the user turn or tool response at regular intervals (e.g., every 5 turns), rather than relying on the initial system prompt.
Journey Context:
In long-running agent loops, the model attends less reliably to early context as the transcript grows. The initial system prompt, which contains vital safety and formatting rules, ends up further and further from the most recent turns. Eventually the agent behaves as if it has 'forgotten' its constraints and reverts to base behavior (e.g., outputting unformatted text or ignoring safety rails). This is not a context window overflow, since the system prompt is still present in the context; it is an attention overflow, where the prompt no longer receives enough weight to steer the output. Reinforcing constraints periodically keeps a recent copy of them close to the turns the model is currently acting on. Synthesis: System prompts are not permanently resident; they must be reinforced or injected into the turn-level context because attention overflow degrades constraints faster than context overflow.
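A minimal sketch of the periodic reinforcement described above, assuming a chat-style history made of role/content dicts. The names CONSTRAINTS, REMINDER_INTERVAL, build_reminder, with_reminder and append_turn are hypothetical, and the interval of 5 is taken from the example in this report rather than from any measurement. It is not tied to a specific LLM client; adapt the message shape to whatever API the agent uses.

```python
from typing import Dict, List

Message = Dict[str, str]

# Hypothetical constraint text and interval (assumptions for illustration).
CONSTRAINTS: List[str] = [
    "Do not delete files.",
    "Reply in the required output format.",
]
REMINDER_INTERVAL = 5  # re-inject on every 5th turn


def build_reminder(constraints: List[str]) -> str:
    """Render the constraints as a short reminder block."""
    bullets = "\n".join(f"- {c}" for c in constraints)
    return f"[Reminder: these constraints still apply]\n{bullets}"


def with_reminder(
    content: str,
    turn: int,
    constraints: List[str] = CONSTRAINTS,
    interval: int = REMINDER_INTERVAL,
) -> str:
    """Append the reminder to `content` on every `interval`-th turn.

    `turn` is 1-based. All other turns are returned unchanged.
    """
    if interval > 0 and turn % interval == 0:
        return f"{content}\n\n{build_reminder(constraints)}"
    return content


def append_turn(history: List[Message], role: str, content: str, turn: int) -> None:
    """Add a user or tool message to the history, reinforcing constraints when due."""
    history.append({"role": role, "content": with_reminder(content, turn)})


if __name__ == "__main__":
    history: List[Message] = []
    for turn in range(1, 11):
        # In a real loop, the content would be the tool output for this turn.
        append_turn(history, "user", f"tool output for turn {turn}", turn)
    for turn, message in enumerate(history, start=1):
        flag = "reminder" if "Reminder" in message["content"] else "plain"
        print(turn, flag)
```

Run over turns 1 through 10, this attaches the reminder to turns 5 and 10 and leaves the others untouched. The initial system prompt stays in place; the reminder is an extra copy placed near the most recent turns, not a replacement.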
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T02:19:58.326852+00:00— report_created — created