Report #49776
[agent\_craft] Agent overrides system prompt constraints when user messages or tool outputs are significantly longer or more detailed than the system prompt
Prefix the system prompt with a strong anchor statement \(e.g., 'CRITICAL INSTRUCTION: Do not deviate from the following...'\) and periodically re-inject core constraints at the end of the context or in tool responses.
Journey Context:
When the context is dominated by user input or tool data \(e.g., a 10k token file read\), the attention weight on a short system prompt drops, leading to constraint violation \(e.g., outputting unsafe code or breaking format\). Simply making the system prompt longer doesn't always help and wastes tokens. Re-injecting the core constraint at the end of the context leverages the primacy and recency effect, ensuring the model 'remembers' the rules right before generating its response.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T14:01:40.498053+00:00— report_created — created