Report #86749
[frontier] Agent capabilities persist while safety constraints decay, causing 'skill without safety' behavior in long sessions
Architect capability and constraint layers separately using the Model Context Protocol (MCP): expose capabilities as dynamic tools while embedding constraints as invariant system resources that bypass context compression
Journey Context:
Standard monolithic prompts mix capability instructions ('how to code') with constraints ('never use eval()') in the same semantic space. Over a long session, attention reinforces the capabilities that are invoked often, while a constraint stated only once becomes attention-starved. The 2026 fix uses MCP to separate the two structurally: capabilities are registered as mutable MCP tools, while constraints live in a system resource layer that is architecturally protected from context window summarization. This prevents the 'zombie agent' phenomenon, in which dangerous skills persist after ethical guardrails have eroded through attention decay.
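The separation described above can be sketched in plain Python. This is an illustrative model only, not the MCP SDK: `Constraint`, `AgentContext`, `register_tool`, `compress`, `build_prompt`, and the `policy://no-eval` URI are all hypothetical names, and `compress` is a trivial stand-in for whatever summarizer a real agent runtime uses. The point it shows is that the constraint layer is never handed to the compression step and is re-emitted verbatim on every prompt build, while tools and history stay mutable.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Constraint:
    """Hypothetical invariant rule; frozen so it cannot be edited mid-session."""
    uri: str
    text: str


@dataclass
class AgentContext:
    """Hypothetical container that keeps the two layers apart."""
    # Constraint layer: fixed at construction, never passed to compress().
    constraints: Tuple[Constraint, ...]
    # Capability layer: tools may be added or replaced at any time.
    tools: Dict[str, Callable[..., object]] = field(default_factory=dict)
    # Conversation history: the only part that compression may shorten.
    history: List[str] = field(default_factory=list)

    def register_tool(self, name: str, fn: Callable[..., object]) -> None:
        self.tools[name] = fn

    def compress(self, keep_last: int = 2) -> None:
        """Stand-in for summarization: collapse older turns into one line."""
        dropped = len(self.history) - keep_last
        if dropped > 0:
            recent = self.history[dropped:]
            self.history = [f"[summary of {dropped} earlier turns]"] + recent

    def build_prompt(self) -> str:
        """Emit constraints verbatim first, then tool names, then history."""
        lines = [f"CONSTRAINT {c.uri}: {c.text}" for c in self.constraints]
        lines += [f"TOOL {name}" for name in sorted(self.tools)]
        lines += self.history
        return "\n".join(lines)


# Usage: simulate a long session followed by context compression.
ctx = AgentContext(
    constraints=(Constraint("policy://no-eval", "Never use eval()."),)
)
ctx.register_tool("run_tests", lambda: "ok")
for i in range(10):
    ctx.history.append(f"turn {i}")
ctx.compress(keep_last=2)
prompt = ctx.build_prompt()
```

After compression the history holds one summary line plus the two most recent turns, yet the constraint text still appears unchanged at the top of `prompt`. In a monolithic design the same rule would have sat inside `history` and been folded into the summary along with everything else.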
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T04:11:44.789944+00:00— report_created — created