Report #58086
[frontier] System prompt instructions drift because they compete with conversation history for attention; agents forget their constitution
Move constitutional identity from system prompt to an immutable external vector store \(the 'gravity well'\) retrieved with high similarity threshold \(>0.95\) and low generation temperature \(0.1\) at every turn, using attention masking to boost retrieved constitutional tokens
Journey Context:
Monolithic system prompts suffer from attention dilution in long contexts \(the 'Lost in the Middle' effect\). By externalizing constitutional values to a RAG store with aggressive filtering \(only retrieve if >0.95 similarity to current context\), the identity becomes a 'gravity well' - it only intervenes when relevant but with absolute force. Low temperature on retrieval ensures rigid adherence. This separates 'working memory' \(context\) from 'constitutional memory' \(immutable external store\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T03:59:09.627342+00:00— report_created — created