Report #82192
[frontier] Agent ignores behavioral constraints but still performs tasks correctly — capabilities persist while rules decay
Move your most critical behavioral constraints out of system-prompt prose and into tool schemas and structured output definitions. Constraints encoded as enum values, regex patterns, or required fields in a JSON schema tend to be followed more reliably than prose instructions because, where the provider enforces the schema during decoding, they restrict which tokens can be generated at all.
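As an illustration of what "encoding a constraint in the schema" can look like, here is a minimal sketch of a JSON-Schema-style tool definition. The tool name `update_ticket`, its fields, and the `TCK-` ID format are hypothetical, and the exact wrapper a given provider expects around `parameters` may differ.

```python
import json


def build_tool_schema(allowed_actions):
    """Return a JSON-Schema-style tool definition (hypothetical tool and fields)."""
    return {
        "name": "update_ticket",  # hypothetical tool name
        "description": "Apply one permitted action to a support ticket.",
        "parameters": {
            "type": "object",
            "properties": {
                # enum: an action that is not listed cannot be expressed at all
                "action": {"type": "string", "enum": sorted(allowed_actions)},
                # pattern: ticket IDs must look like TCK-000123 (assumed format)
                "ticket_id": {"type": "string", "pattern": "^TCK-[0-9]{6}$"},
                # listed under "required" below, so a reason is always present
                "reason": {"type": "string", "minLength": 1},
            },
            "required": ["action", "ticket_id", "reason"],
            # reject any field the schema does not name
            "additionalProperties": False,
        },
    }


# "delete" is intentionally left out: the constraint is the absence of the value.
TOOL_SCHEMA = build_tool_schema({"comment", "escalate", "close"})

if __name__ == "__main__":
    print(json.dumps(TOOL_SCHEMA, indent=2))
```

The prose rule "never delete tickets" becomes the absence of `delete` from the enum, so it no longer depends on the model recalling an instruction.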
Journey Context:
LLMs do not treat every part of the context equally. Prose instructions in a system prompt depend on attention, which is fragile and can weaken as the context grows. Tool schemas and structured output schemas can instead be applied as programmatic constraints during token generation: when schema enforcement (constrained decoding) is active, the model cannot emit output that violates the schema. A system prompt saying 'always respond in JSON' can be forgotten at turn 80; a structured output schema requiring JSON is enforced just as mechanically at turn 800. The frontier practice is to encode 'never do X' constraints as schema-level validations: if the agent must never return a certain type of output, make that output schema-invalid. This shifts constraint enforcement from attention-based (fragile, decays over context length) to schema-based (robust, constant). The tradeoff is that only constraints that map to output structure can be encoded this way. For those that can, the reliability improvement over long sessions is dramatic and requires no re-injection of the instruction.
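Where the provider does not enforce the schema at decode time, the same idea can be approximated by checking each output after generation and rejecting anything schema-invalid. The sketch below is a hand-rolled, standard-library check covering only `required`, `enum`, `pattern`, and `additionalProperties`; it is not a full JSON Schema validator, and the reply types and the forbidden `raw_sql` type are hypothetical.

```python
import re

# Hypothetical output schema: the "raw_sql" reply type is deliberately absent.
REPLY_SCHEMA = {
    "required": ["type", "body"],
    "properties": {
        "type": {"enum": ["answer", "clarifying_question", "refusal"]},
        "body": {"pattern": r"\S"},  # at least one non-whitespace character
    },
    "additionalProperties": False,
}


def violations(output, schema):
    """Return a list of schema violations; an empty list means the output is valid."""
    problems = []
    props = schema.get("properties", {})

    # Required fields must be present.
    for key in schema.get("required", []):
        if key not in output:
            problems.append(f"missing required field: {key}")

    for key, value in output.items():
        rule = props.get(key)
        if rule is None:
            # Fields the schema does not name are rejected when extras are disallowed.
            if schema.get("additionalProperties") is False:
                problems.append(f"unexpected field: {key}")
            continue
        if "enum" in rule and value not in rule["enum"]:
            problems.append(f"{key}={value!r} is not an allowed value")
        if "pattern" in rule and not (
            isinstance(value, str) and re.search(rule["pattern"], value)
        ):
            problems.append(f"{key} does not match {rule['pattern']}")
    return problems


if __name__ == "__main__":
    print(violations({"type": "raw_sql", "body": "DROP TABLE users"}, REPLY_SCHEMA))
```

The check runs identically at turn 80 and turn 800 because it never depends on what the model remembers. A caller would typically retry or fail closed when the list is non-empty.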
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T20:33:13.613468+00:00— report_created — created