Report #90158
[synthesis] Agent violates safety constraints as conversation history grows despite system prompt being present
Implement priority-based context management where system prompts are strictly non-truncatable. If the context exceeds the window, truncate the middle of the conversation history or summarize earlier turns, never the system prompt.
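A minimal sketch of the priority rule, under stated assumptions: messages are plain dicts with "role" and "content" keys, system messages sit at the start of the list, and token counts are approximated by whitespace word count. The names fit_to_window and approx_tokens are hypothetical and not taken from any framework. For simplicity this version drops the oldest non-system turns instead of summarizing them; a summarizer could replace the dropped span.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]  # assumed shape: {"role": ..., "content": ...}


def approx_tokens(text: str) -> int:
    # Stand-in tokenizer (assumption): whitespace word count.
    # Swap in the model's real tokenizer in practice.
    return len(text.split())


def fit_to_window(
    messages: List[Message],
    max_tokens: int,
    count_tokens: Callable[[str], int] = approx_tokens,
) -> List[Message]:
    """Trim history to fit max_tokens without ever touching system messages."""
    system = [m for m in messages if m["role"] == "system"]
    history = [m for m in messages if m["role"] != "system"]

    # System prompts are reserved first; they are never truncated.
    budget = max_tokens - sum(count_tokens(m["content"]) for m in system)
    if budget < 0:
        # Fail loudly instead of silently cutting instructions.
        raise ValueError("system prompt alone exceeds the context window")

    # Walk history from newest to oldest, keeping turns while they fit.
    kept: List[Message] = []
    for message in reversed(history):
        cost = count_tokens(message["content"])
        if cost > budget:
            break
        kept.append(message)
        budget -= cost

    kept.reverse()  # restore chronological order
    return system + kept
```

The design choice that matters here is the explicit error: if the system prompt cannot fit, the call fails visibly instead of degrading the instructions.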
Journey Context:
To avoid context window limit errors, many agent frameworks apply a sliding window or summarization technique to the message history. A silent failure occurs when the truncation algorithm cuts off the end of the system prompt (which contains safety or formatting rules) to make room for recent messages. The agent continues without errors but violates core constraints because part of its instructions was silently deleted. The leading indicator is a sudden change in the system-prompt-token-count metric in production traces. For a fixed system prompt this number should stay constant across requests; if it fluctuates, constraints are being dropped.
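The leading indicator described above can be checked with a small helper. This is a sketch under assumptions: the function names are hypothetical, tokens are approximated by whitespace word count, and the first request in a trace is treated as the baseline. An intentional prompt change (for example, a deploy) would also be flagged, so the baseline should be reset when that happens.

```python
from typing import Dict, List

Message = Dict[str, str]  # assumed shape: {"role": ..., "content": ...}


def approx_tokens(text: str) -> int:
    # Stand-in tokenizer (assumption): whitespace word count.
    return len(text.split())


def system_prompt_token_count(messages: List[Message]) -> int:
    """Total tokens across all system messages in one outgoing request."""
    return sum(
        approx_tokens(m["content"]) for m in messages if m["role"] == "system"
    )


def find_drift(counts: List[int]) -> List[int]:
    """Return indices of requests whose count differs from the first request."""
    if not counts:
        return []
    baseline = counts[0]
    return [i for i, count in enumerate(counts) if count != baseline]
```

Recording system_prompt_token_count for every outgoing request and alerting on any index returned by find_drift is one way to surface the truncation before the constraint violations are noticed.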
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T09:55:36.204280+00:00— report_created — created