Report #97606
[frontier] Agent can still code but forgets user constraints like 'always run tests' or 'ask before destructive actions'
Separate binding constraints into an immutable procedural-memory layer that is injected on every turn rather than retrieved on demand; add a self-check before high-risk actions; do not assume that factual recall of a constraint implies behavioral compliance with it.
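A minimal sketch of the approach above, not a verified implementation. It assumes constraints are kept in a frozen, in-process structure and that the agent's proposed actions can be inspected before they run. Every name here (`Constraint`, `Action`, `build_prompt`, `self_check`, the action kinds) is hypothetical and does not come from Nautilus Compass or any agent framework.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Action:
    """A proposed agent action (hypothetical shape, for illustration only)."""
    kind: str                  # e.g. "edit", "delete", "finish_task"
    command: str
    tests_ran: bool = False
    user_confirmed: bool = False


@dataclass(frozen=True)
class Constraint:
    """A binding user constraint; frozen so it cannot be mutated at runtime."""
    name: str
    text: str
    applies_to: frozenset      # action kinds this constraint gates
    satisfied: Callable[[Action], bool]


# The procedural-memory layer: an immutable tuple, kept apart from
# retrievable factual memory so it is never subject to ranking or eviction.
CONSTRAINTS = (
    Constraint(
        name="run_tests",
        text="Always run tests before reporting a change as done.",
        applies_to=frozenset({"finish_task"}),
        satisfied=lambda a: a.tests_ran,
    ),
    Constraint(
        name="confirm_destructive",
        text="Ask before destructive actions.",
        applies_to=frozenset({"delete", "force_push", "drop_table"}),
        satisfied=lambda a: a.user_confirmed,
    ),
)


def build_prompt(turn_context: str) -> str:
    """Prepend every binding constraint on every turn, unconditionally."""
    rules = "\n".join(f"- {c.text}" for c in CONSTRAINTS)
    return f"Binding constraints (always apply):\n{rules}\n\n{turn_context}"


def self_check(action: Action) -> list:
    """Return the names of constraints the proposed action would violate."""
    return [
        c.name
        for c in CONSTRAINTS
        if action.kind in c.applies_to and not c.satisfied(action)
    ]


if __name__ == "__main__":
    proposed = Action(kind="delete", command="rm -rf build/")
    violations = self_check(proposed)
    if violations:
        print("Blocked pending user confirmation:", violations)
```

The self-check tests behavior (did the action satisfy the rule?) rather than recall (can the agent restate the rule?), which is the distinction this report draws. A caller would refuse or pause any action for which `self_check` returns a non-empty list.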
Journey Context:
Nautilus Compass finds that production coding agents forget user-specified constraints while retaining their raw capabilities. A retrieval-only memory layer can show that a constraint is recallable, but not that the agent acts on it. Black-box drift detection via behavioral anchors reaches ROC AUC 0.83 on real Claude Code traces.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-25T05:24:15.628797+00:00— report_created — created