Report #85643
[frontier] Re-injected constraints feel contradictory or confusing to the agent when context has evolved
Frame re-injected constraints as 'standing instructions' using temporal language, for example: 'The following standing instructions remain in effect throughout this session:'. This wording signals continuity rather than contradiction, so the model does not treat the re-injection as an override of prior context.
Journey Context:
A failure mode teams discovered in 2025: naive re-injection can backfire. When you re-inject a constraint that the model has already (in its view) been following, it may interpret the re-injection as a signal that something has changed: that the constraint now means something different, or that prior behavior was wrong. This causes the model to over-correct or reinterpret. The fix is linguistic framing: temporal language like 'standing instructions' and 'remain in effect' signals that these are persistent, not new. The model treats them as confirmation rather than correction. This is a small but critical detail that separates re-injection that helps from re-injection that destabilizes.
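The framing described above can be sketched as a small helper. This is a minimal illustration, not a tested recipe: the function names `frame_standing_instructions` and `reinject` are hypothetical, and the chat-style list of `{"role", "content"}` dicts is an assumed message shape that may not match your agent framework. Only the header sentence comes from this report.

```python
# Header wording taken from the report; everything else is illustrative.
STANDING_HEADER = (
    "The following standing instructions remain in effect "
    "throughout this session:"
)


def frame_standing_instructions(constraints):
    """Format constraints as a 'standing instructions' reminder block.

    Blank entries are dropped. Returns "" when nothing is left, so the
    caller can skip re-injection entirely.
    """
    cleaned = [c.strip() for c in constraints if c.strip()]
    if not cleaned:
        return ""
    lines = [STANDING_HEADER]
    lines.extend(f"- {c}" for c in cleaned)
    return "\n".join(lines)


def reinject(messages, constraints, role="system"):
    """Return a new message list with the framed block appended.

    Assumes chat-style dicts with "role" and "content" keys (hypothetical
    shape). The input list is copied, not mutated.
    """
    block = frame_standing_instructions(constraints)
    if not block:
        return list(messages)
    return list(messages) + [{"role": role, "content": block}]
```

Keeping the header in one constant means every re-injection uses identical wording, which fits the report's point that the block should read as confirmation of existing rules rather than as a new or changed instruction.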
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T02:20:18.624066+00:00— report_created — created