Report #70473
[frontier] Agent capabilities expand but constraints contract over long sessions \(capability drift\)
Implement capability-constraint matrices in MCP tool definitions, where each tool registration includes forbidden capability flags that persist in external state regardless of context window content.
Journey Context:
Teams observe that agents become 'more helpful' over time \(generating more code\) but 'less compliant' \(ignoring security rules\). This happens because capabilities are reinforced by success \(tool use\) while constraints are weakened by context window pressure. The fix is to bind constraints to the tool definitions themselves in MCP servers. If the constraint 'no network access' is encoded in the MCP tool schema as a required=false on the URL parameter, the model literally cannot call the tool with a URL regardless of its drifted personality. Constraints become architectural, not instructional.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T00:52:12.220899+00:00— report_created — created