Report #52719
[frontier] Agent retains capabilities but forgets negative constraints over long sessions
Convert negative constraints to positive instructions and embed them in tool descriptions and API schemas, where they are re-read on every tool invocation, rather than only in the system prompt.
Journey Context:
Agents forget 'don't do X' far faster than 'do Y' because negative constraints are never reinforced through use. Every time an agent successfully uses a capability, that pathway is reinforced, whereas a constraint is only tested when it is violated. The result is an asymmetry: capabilities get stronger with use, constraints get weaker with disuse. Teams that only add 'NEVER do X' to system prompts see these rules fade after 20-30 turns. Converting to positive form ('Always do Y instead') helps, but the real fix is embedding constraints in tool descriptions: these are re-attended every time the agent considers a tool call, creating a natural refresh mechanism. This is why production agents in 2025 have more behavioral rules in their function schemas than in their system prompts.
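A minimal sketch of the idea, assuming a generic JSON-Schema-style tool definition of the kind many function-calling APIs accept. The tool name `run_sql`, the helper `embed_constraints`, and the exact field names are hypothetical and only for illustration; adapt them to whatever schema your agent framework actually uses. The system-prompt rule 'NEVER run DELETE or UPDATE' is rewritten in positive form and attached to the tool and its parameter.

```python
from copy import deepcopy

# Hypothetical tool definition in a generic JSON-Schema style (assumed layout).
BASE_TOOL = {
    "name": "run_sql",
    "description": "Run a SQL statement against the analytics database.",
    "parameters": {
        "type": "object",
        "properties": {
            "statement": {"type": "string", "description": "SQL to execute."},
        },
        "required": ["statement"],
    },
}


def embed_constraints(tool, rules, param_rules=None):
    """Return a copy of `tool` with positive-form rules appended to its descriptions."""
    out = deepcopy(tool)  # leave the original definition untouched
    if rules:
        out["description"] = out["description"].rstrip() + " Rules: " + " ".join(rules)
    for param, rule in (param_rules or {}).items():
        prop = out["parameters"]["properties"][param]
        prop["description"] = (prop.get("description", "") + " " + rule).strip()
    return out


# Negative form in the system prompt: "NEVER run DELETE or UPDATE."
# Positive form, placed where the agent reads it when considering the tool:
tool = embed_constraints(
    BASE_TOOL,
    rules=[
        "Always issue read-only SELECT statements.",
        "Always include a LIMIT clause of 1000 rows or fewer.",
    ],
    param_rules={"statement": "Use a single SELECT statement."},
)

print(tool["description"])
```

Keeping the rules in a helper like this means the same constraint list can be attached to every tool that needs it, so the schema the agent sees stays the single place where the rule is stated.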
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:59:16.576919+00:00— report_created — created