Report #87394
[frontier] Constraint Amnesia: Negative Constraints Decay Faster Than Capabilities in Long Sessions
Implement 'Hard-Negative Tool Wrapping': encode each negative constraint (e.g., 'never write to .env') as a required boolean parameter in the tool schema itself (e.g., 'confirm_not_env: true'), so the model must actively acknowledge the constraint on every tool invocation. Complement this with a 'Constraint Re-injection Protocol': restate critical constraints in tool descriptions (which are re-sent every turn) rather than relying on a system prompt that gets buried as the session grows.
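A minimal sketch of Hard-Negative Tool Wrapping, assuming a JSON-Schema-style tool definition of the kind most function-calling APIs accept. The tool name `write_file`, the helper `check_write_call`, and the exact schema layout are hypothetical and only illustrate the idea. Since the boolean is merely the model's own claim, the sketch also checks the path independently before anything is executed.

```python
from pathlib import PurePosixPath
from typing import Any

# Hypothetical tool schema: the negative constraint is a required parameter.
WRITE_FILE_TOOL = {
    "name": "write_file",
    "description": "Write text to a file. Never write to .env.",
    "parameters": {
        "type": "object",
        "properties": {
            "path": {"type": "string"},
            "content": {"type": "string"},
            "confirm_not_env": {
                "type": "boolean",
                "description": "Must be true: confirms the target is not a .env file.",
            },
        },
        "required": ["path", "content", "confirm_not_env"],
    },
}


def check_write_call(args: dict[str, Any]) -> tuple[bool, str]:
    """Validate a model-produced call before the tool is actually run."""
    required = WRITE_FILE_TOOL["parameters"]["required"]
    missing = [key for key in required if key not in args]
    if missing:
        return False, f"missing required parameters: {missing}"
    if args["confirm_not_env"] is not True:
        return False, "confirm_not_env must be true"
    # The acknowledgement is only a claim, so enforce the constraint as well.
    # Assumption: paths use forward slashes.
    name = PurePosixPath(args["path"]).name
    if name == ".env" or name.startswith(".env."):
        return False, "refusing to write to a .env file"
    return True, "ok"
```

A rejected call can be returned to the model as a tool error, which restates the constraint at the moment it was about to be violated.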
Journey Context:
Teams initially tried repeating constraints in user messages, but this bloats the context and the repeated text gets ignored as 'boilerplate'. The breakthrough was realizing that the model treats tool schemas as 'hard constraints' (it must satisfy them to produce valid JSON output), whereas system instructions are 'soft constraints' that decay with context depth. The function-calling mechanism thus serves as an invariant memory anchor that survives context window shifts.
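One way to keep constraints attached to the tool definitions instead of user messages is to append them to every tool description before each request. This is a rough sketch only: `reinject_constraints` and the `CONSTRAINT:` prefix are hypothetical names, and the tool dictionaries are assumed to carry a plain `description` string.

```python
import copy
from typing import Any


def reinject_constraints(
    tools: list[dict[str, Any]], constraints: list[str]
) -> list[dict[str, Any]]:
    """Return a copy of `tools` with the constraints appended to each description."""
    suffix = " ".join(f"CONSTRAINT: {c}" for c in constraints)
    updated = copy.deepcopy(tools)  # leave the caller's definitions untouched
    for tool in updated:
        description = tool.get("description", "").strip()
        # Skip tools that already carry the suffix so repeated calls are idempotent.
        if suffix and suffix not in description:
            tool["description"] = f"{description} {suffix}".strip()
    return updated
```

Calling this once per turn on the base tool list keeps the constraint text next to the schema the model reads when it forms a call, without growing the message history.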
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T05:16:54.561125+00:00— report_created — created