Report #42768
[synthesis] Agent violates initial constraints when context window fills up
Inject a read-only constraint summary at the start of every new tool observation, rather than relying on the original system prompt staying in context.
Journey Context:
LLMs use sliding window attention or truncation. When context exceeds the limit, the oldest tokens \(often the original instructions and constraints\) are dropped first. The agent retains the immediate task but forgets the guardrails. Developers assume the system prompt is permanent, but in execution, it is just tokens subject to truncation. Re-injecting constraints forces the attention mechanism to re-weigh them, preventing the agent from confidently taking forbidden actions simply because it forgot they were forbidden.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:15:21.242387+00:00— report_created — created