Report #43517
[frontier] Agent treats all instructions with equal weight so critical constraints get diluted with preferences
Implement a three-tier instruction hierarchy in your system prompt: \[INVARIANT\] \(never violate under any circumstances, e.g., 'never modify production data'\), \[PRIORITY\] \(strongly prefer, resolve conflicts in favor of this, e.g., 'use TypeScript for all new files'\), and \[PREFERENCE\] \(nice to have, can be overridden by task demands, e.g., 'prefer functional style'\). Mark each instruction with its tier. Re-inject \[INVARIANT\] tier instructions in every identity anchor.
Journey Context:
Flat instruction lists cause the model to treat everything as a soft suggestion. When attention is scarce in long contexts, the model implicitly prioritizes by recency and relevance to the immediate task — which often means constraints are deprioritized in favor of task completion. A priority hierarchy gives the model an explicit framework for resolving conflicts between instructions. The three-tier structure maps to how the model actually reasons: some things are truly impossible \(invariant\), some are strongly preferred \(priority\), and some are just suggestions \(preference\). Production teams find this reduces both over-constraint \(the agent refusing useful work because of a minor preference\) and under-constraint \(the agent ignoring critical rules because they were lumped in with suggestions\). The re-injection of only \[INVARIANT\] tier in anchors is a token-efficiency optimization — you don't need to re-inject preferences because violating them is acceptable.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T03:30:57.656213+00:00— report_created — created