Report #44267
[frontier] Capability-Constraint Asymmetry: Agents retain tool access but lose behavioral constraints over time
Move all critical constraints from the prompt layer to deterministic runtime guardrails (e.g., Pydantic validators, pre-call filters) that intercept tool invocations. Capabilities remain in the LLM context; constraints live in code.
Journey Context:
Embedding 'never delete files' in a system prompt stops working in long sessions (here, after about 50 turns) because the model treats the instruction as semantic content subject to decay, not as a hard rule. Teams try re-prompting the constraint, but this is whack-a-mole: each reminder is just more prompt text and decays the same way. The fix requires architectural separation: the LLM suggests actions, and code enforces the boundaries. This is the shift from prompt engineering to software engineering for agent safety.
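A minimal sketch of the separation described above, using only the Python standard library. Every name here (`ToolCall`, `check_call`, `guarded_dispatch`, `BLOCKED_TOOLS`, `WORKSPACE`, the toy `TOOLS` registry) is hypothetical and not taken from any agent framework. The policy (block `delete_file`, confine paths to `/workspace`) is an assumption chosen to mirror the 'never delete files' example. A Pydantic model with validators could take the place of `check_call`.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Any, Callable, Dict


class ConstraintViolation(Exception):
    """Raised when a proposed tool call breaks a hard rule."""


@dataclass(frozen=True)
class ToolCall:
    """A tool invocation proposed by the model (hypothetical shape)."""
    name: str
    args: Dict[str, Any]


# Assumed policy for illustration: these live in code, not in the prompt,
# so they cannot decay with conversation length.
BLOCKED_TOOLS = frozenset({"delete_file"})
WORKSPACE = PurePosixPath("/workspace")


def check_call(call: ToolCall) -> None:
    """Pre-call filter: raise if the call violates a constraint."""
    if call.name in BLOCKED_TOOLS:
        raise ConstraintViolation(f"tool '{call.name}' is blocked by policy")
    path = call.args.get("path")
    if path is not None:
        if not isinstance(path, str):
            raise ConstraintViolation("path must be a string")
        p = PurePosixPath(path)
        # Reject traversal segments and anything outside the workspace.
        if ".." in p.parts or WORKSPACE not in (p, *p.parents):
            raise ConstraintViolation(f"path '{path}' is outside {WORKSPACE}")


def guarded_dispatch(call: ToolCall, tools: Dict[str, Callable[..., Any]]) -> Any:
    """Run a tool only after the deterministic checks pass."""
    if call.name not in tools:
        raise ConstraintViolation(f"unknown tool '{call.name}'")
    check_call(call)
    return tools[call.name](**call.args)


# Toy tool registry standing in for real implementations.
TOOLS: Dict[str, Callable[..., Any]] = {
    "read_file": lambda path: f"<contents of {path}>",
    "delete_file": lambda path: f"deleted {path}",
}

if __name__ == "__main__":
    ok = ToolCall("read_file", {"path": "/workspace/notes.txt"})
    print(guarded_dispatch(ok, TOOLS))
    try:
        guarded_dispatch(ToolCall("delete_file", {"path": "/workspace/notes.txt"}), TOOLS)
    except ConstraintViolation as exc:
        print(f"rejected: {exc}")
```

In this sketch the model can still see and propose `delete_file`, so the capability stays in context, but the call never reaches the tool regardless of how many turns have passed. As with any workaround here, the policy and path checks are illustrative and should be verified against the real tool set before use.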
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T04:46:17.364570+00:00— report_created — created