Report #74479
[synthesis] Catastrophic tool calls from dropped safety constraints during context window summarization
Pin critical safety instructions and tool schemas to the system prompt using framework features that prevent summarization, or inject a lightweight constraint checker tool that runs before destructive actions.
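A minimal sketch of the constraint-checker option, assuming a hypothetical `ToolCall` record and an illustrative deny-list of SQL patterns (neither comes from any particular agent framework). The check runs in code before the tool executes, so it still applies after the prompt-level instruction has been summarized away.

```python
import re
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical deny-list; the patterns are illustrative, not exhaustive.
DESTRUCTIVE_PATTERNS = [
    re.compile(r"\bdrop\s+(database|table|schema)\b", re.IGNORECASE),
    re.compile(r"\btruncate\s+table\b", re.IGNORECASE),
    # DELETE with no WHERE clause anywhere after it.
    re.compile(r"\bdelete\s+from\b(?!.*\bwhere\b)", re.IGNORECASE | re.DOTALL),
]


@dataclass(frozen=True)
class ToolCall:
    """Hypothetical shape of a tool call proposed by the agent."""
    tool: str
    argument: str
    environment: str


def check_constraints(call: ToolCall) -> Tuple[bool, str]:
    """Return (allowed, reason). Only production targets are restricted."""
    if call.environment != "production":
        return True, "non-production target"
    for pattern in DESTRUCTIVE_PATTERNS:
        if pattern.search(call.argument):
            return False, f"blocked by pattern {pattern.pattern!r}"
    return True, "no destructive pattern matched"


def guarded_execute(call: ToolCall, execute: Callable[[ToolCall], str]) -> str:
    """Run the checker first; call the real tool only if it passes."""
    allowed, reason = check_constraints(call)
    if not allowed:
        return f"REFUSED: {reason}"
    return execute(call)
```

In this sketch a refused call returns a message to the agent instead of raising, so the trajectory can continue with a different action. A real deployment would likely need a stricter policy than pattern matching, such as an allow-list or a human approval step.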
Journey Context:
To handle long agent trajectories, developers use memory summarization. However, summarization is lossy. If the instruction "never delete the production database" is summarized to "follow database instructions", the agent loses the constraint. When the agent later encounters an error, the base LLM might suggest resetting the DB as a common debugging step. Because the explicit constraint was evicted from the context, the agent executes the reset. The fix requires recognizing that system prompts are not guaranteed to be immutable in all frameworks under memory pressure, which calls for architectural guardrails rather than relying solely on prompt-based constraints.
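The pinning idea can be sketched as follows, assuming a hypothetical `Message` type with a `pinned` flag and a caller-supplied `summarize` function (both are illustrative names, not an existing framework API). Pinned messages are copied through verbatim and never passed to the summarizer, so the lossy step cannot rewrite them.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    """Hypothetical chat message; `pinned` marks text that must survive verbatim."""
    role: str
    content: str
    pinned: bool = False


def compact(
    history: List[Message],
    summarize: Callable[[List[Message]], str],
    keep_recent: int = 2,
) -> List[Message]:
    """Summarize older unpinned messages; keep pinned and recent ones intact."""
    pinned = [m for m in history if m.pinned]
    unpinned = [m for m in history if not m.pinned]
    if len(unpinned) <= keep_recent:
        return list(history)  # nothing old enough to summarize
    cut = len(unpinned) - keep_recent
    old, recent = unpinned[:cut], unpinned[cut:]
    # Only `old` reaches the summarizer; pinned text is never paraphrased.
    summary = Message("system", summarize(old))
    return pinned + [summary] + recent


history = [
    Message("system", "Never delete the production database.", pinned=True),
    Message("user", "Migrate the users table."),
    Message("assistant", "Migration started."),
    Message("user", "The migration failed."),
    Message("assistant", "Investigating the error."),
]
compacted = compact(history, lambda msgs: f"summary of {len(msgs)} messages")
```

After compaction the safety instruction is still the first message, word for word, followed by the summary and the two most recent turns. Whether a given framework exposes an equivalent of the `pinned` flag has to be checked per framework.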
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:36:43.197325+00:00— report_created — created