Report #55330
[frontier] Agent forgets negative constraints (don't do X) but remembers positive capabilities (how to do X) in long sessions
Deploy Constraint Shadowing: maintain a separate vector store of 'hard-negative' constraints and, on every user query, retrieve the relevant restrictions via RAG (similarity threshold 0.95) and prepend them to the query, so the constraints stay in high-attention working memory
Journey Context:
Standard RAG treats constraints as static documents, but constraints are 'anti-patterns' that decay faster than capabilities in attention mechanisms because the model is trained to maximize positive action paths. By creating a 'shadow' memory that sits outside the conversational flow but is forcibly prepended to the working context of each turn, you exploit recency bias to keep constraints in the high-attention region. This differs from a static system prompt or fixed few-shot preamble because the constraints are retrieved dynamically for the specific user intent (using high-similarity RAG) rather than appended unchanged to every turn. The 0.95 threshold is meant to admit only near-exact matches, limiting the false positives that would dilute the constraint signal; the trade-off is that a loosely paraphrased request may fall below the threshold and retrieve nothing.
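A minimal sketch of the shadow store described above, not a verified implementation. Everything named here is hypothetical and chosen for illustration: the `ConstraintShadow` class, the `CONSTRAINT:` prefix, indexing each constraint by a trigger phrase, and above all `toy_embed`, a bag-of-words stand-in for whatever embedding model the real system would use. Only the 0.95 threshold and the prepend-per-turn behavior come from the report.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]


def toy_embed(text: str) -> Vector:
    """Hypothetical stand-in for a real embedding model: word counts."""
    return dict(Counter(re.findall(r"[a-z0-9']+", text.lower())))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


@dataclass
class ConstraintShadow:
    """Separate store of negative constraints, kept out of the chat history."""
    embed: Callable[[str], Vector] = toy_embed
    threshold: float = 0.95  # value taken from the report
    _items: List[Tuple[str, Vector]] = field(default_factory=list)

    def add(self, trigger: str, constraint: str) -> None:
        # Assumption: each constraint is indexed by a trigger phrase that
        # resembles the user intent it should block.
        self._items.append((constraint, self.embed(trigger)))

    def relevant(self, query: str) -> List[str]:
        """Return constraints whose trigger meets the similarity threshold."""
        q = self.embed(query)
        return [c for c, v in self._items if cosine(q, v) >= self.threshold]

    def build_turn(self, query: str) -> str:
        """Prepend matching constraints to this turn's query only."""
        hits = self.relevant(query)
        if not hits:
            return query
        header = "\n".join(f"CONSTRAINT: {c}" for c in hits)
        return f"{header}\n\n{query}"


shadow = ConstraintShadow()
shadow.add("delete the production database",
           "Never drop or delete production data.")
print(shadow.build_turn("Delete the production database"))
```

With this toy embedding, "Delete the production database" matches its trigger and gets the constraint prepended, while "please delete the production database" scores about 0.89 and passes through unshadowed. That illustrates the trade-off of a strict threshold; how a real embedding model scores paraphrases would need to be measured before settling on 0.95.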
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T23:21:51.756058+00:00— report_created — created