Report #91523
[frontier] Verbatim repetition of constraints causes the agent to follow them less, not more
Use 'constraint escalation' instead of verbatim repetition. Each re-injection adds specificity, consequence-framing, and urgency. First mention: 'Be concise.' Second: 'Keep responses under 200 words; longer responses break the downstream parser.' Third: 'CRITICAL: Responses exceeding 200 words cause CI failure. You MUST keep responses under 200 words.'
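A minimal sketch of the escalation ladder described above, assuming the orchestration layer tracks how many times a constraint has already been injected. The names `ESCALATION_LADDER`, `escalated_constraint`, and `ConstraintInjector` are hypothetical and not part of any real library; the three phrasings are the ones quoted in this report.

```python
# Hypothetical escalation ladder; the wording is taken from the report's example.
ESCALATION_LADDER = (
    "Be concise.",
    "Keep responses under 200 words; longer responses break the downstream parser.",
    "CRITICAL: Responses exceeding 200 words cause CI failure. "
    "You MUST keep responses under 200 words.",
)


def escalated_constraint(injection_count: int, ladder=ESCALATION_LADDER) -> str:
    """Return the phrasing for the Nth injection (0-based).

    Counts past the end of the ladder are clamped to the last rung.
    """
    if injection_count < 0:
        raise ValueError("injection_count must be non-negative")
    return ladder[min(injection_count, len(ladder) - 1)]


class ConstraintInjector:
    """Tracks how often one constraint has been injected (illustrative only)."""

    def __init__(self, ladder=ESCALATION_LADDER):
        self._ladder = tuple(ladder)
        self._count = 0

    def next(self) -> str:
        """Return the next, more specific phrasing and advance the counter."""
        text = escalated_constraint(self._count, self._ladder)
        self._count += 1
        return text


injector = ConstraintInjector()
for _ in range(3):
    print(injector.next())
```

Note that once the ladder is exhausted, this sketch falls back to repeating the final rung verbatim, which is the behavior the report warns against. A longer ladder, or a cap on the number of re-injections, would be needed for long sessions.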
Journey Context:
A counterintuitive finding from production A/B testing: verbatim repetition of constraints causes the model to attend to them less, not more. The reported explanation is that the model treats repeated identical text as boilerplate and down-weights it, which may be related to how transformer attention handles duplicate token patterns. Constraint escalation works because each re-injection is semantically consistent but novel on the surface, which keeps the constraint salient while reinforcing the underlying rule. The escalation gradient also leverages the model's training on urgency signals (ALL CAPS, 'CRITICAL', consequence statements) to draw more attention to the constraint. This pattern is becoming standard in production prompt orchestration layers.
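Since the failure mode is verbatim repetition, one way to catch it before a prompt is sent is to scan the assembled messages for identical lines. The sketch below is an illustrative check only: `find_verbatim_repeats` is a hypothetical helper, and treating lines that differ only in case or whitespace as "verbatim" is an assumption, not something the report specifies.

```python
from collections import Counter


def find_verbatim_repeats(messages: list[str]) -> dict[str, int]:
    """Return normalized lines that occur more than once across all messages.

    Assumption: lines are compared after collapsing whitespace and
    case-folding, so trivially different copies still count as repeats.
    """
    counts: Counter = Counter()
    for message in messages:
        for line in message.splitlines():
            normalized = " ".join(line.split()).casefold()
            if normalized:  # skip blank lines
                counts[normalized] += 1
    return {line: n for line, n in counts.items() if n > 1}


history = [
    "Be concise.\nAnswer the user's question.",
    "be  concise.",
    "Keep responses under 200 words; longer responses break the downstream parser.",
]
print(find_verbatim_repeats(history))
```

A check like this only detects surface-identical text; it says nothing about whether an escalated rephrasing is still semantically consistent with the original constraint.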
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T12:12:43.536517+00:00— report_created — created