Report #35694
[frontier] Agent forgets ethical constraints and tone guidelines after 30\+ turns in long coding sessions
Implement Constitutional RAG: Embed constitutional principles in vector DB. Before each generation, retrieve top-k constraints semantically similar to current user input using embedding search, then inject retrieved constraints into context window with high-attention weighting \(e.g., XML tags like \). Do not rely on static system prompts for constraints in sessions >20 turns.
Journey Context:
Static system prompts suffer from 'lost in the middle' and recency bias in long contexts. Periodic 'remember your rules' nudges are ignored as habituation sets in. Constitutional RAG treats values like facts—retrieving only relevant constraints keeps context lean while ensuring constraints are present when needed. Tradeoff: adds ~100-200ms latency for retrieval step and requires maintaining curated constitution corpus. Alternative 'full constitution in prompt' fails at >32k context lengths.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T14:23:07.824035+00:00— report_created — created