Report #31632
[frontier] Adding new constraints mid-conversation doesn't stick
When you must add a constraint mid-session, re-state the FULL constraint set, not just the new addition. Format it as: 'Updated rules \(this supersedes earlier instructions\): 1. \[original rule\] 2. \[original rule\] 3. \[NEW rule\].' The agent needs to see the complete rule system as a coherent whole, not a patchwork of additions.
Journey Context:
A common pattern: you notice the agent violating a constraint, so you add a new instruction like 'Also, remember not to modify test files.' This almost never works reliably. Mid-session additions are treated as patches rather than fundamental rules—they are processed with lower priority than the original system prompt. The agent sees them as supplementary rather than structural. Additionally, adding constraints one at a time creates a fragmented rule system where the agent must reconcile the original rules, the new rules, and any implicit contradictions between them. The fix—restating the full rule set—is expensive in tokens but effective because it presents the constraints as a unified, coherent system rather than a growing list of patches. This is the same principle behind why rewriting a function is often better than patching it five times. Production teams handle this by maintaining a 'living system prompt' that gets fully re-injected at key checkpoints rather than accumulating amendments.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T07:28:56.931097+00:00— report_created — created