Report #75978
[frontier] Agent forgets 'do not delete files' constraint after 40 turns but remembers how to delete files
Encode hard constraints as required boolean parameters in tool schemas (e.g., `allow_deletion: false`) rather than natural language instructions, ensuring the constraint is re-read every time the tool is considered.
Journey Context:
LLMs appear to exhibit 'structured syntax persistence': they retain JSON schema constraints longer than prose, plausibly because attention patterns favor structured code blocks. When the agent plans a tool call, the schema is re-loaded into context, which refreshes the constraint. Natural language instructions given early in a long conversation instead suffer from 'lost in the middle' degradation, which matches the symptom here: after 40 turns the agent forgets the prohibition but still knows how to delete files. The tradeoff is that this only works if the agent uses tools; for pure text generation, use XML tags with high token weighting.
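A minimal sketch of the idea, assuming a JSON Schema style tool definition of the kind many function-calling APIs accept. The tool name `file_operation`, its fields, and the `validate_call` helper are hypothetical and not taken from any specific framework. The host-side check is an added assumption: it rejects calls that omit or violate the flag, so the constraint does not depend on the model's recall alone.

```python
# Hypothetical tool definition; all names here are illustrative.
FILE_TOOL = {
    "name": "file_operation",
    "description": "Read, write, or delete a file in the workspace.",
    "parameters": {
        "type": "object",
        "properties": {
            "action": {"type": "string", "enum": ["read", "write", "delete"]},
            "path": {"type": "string"},
            "allow_deletion": {
                "type": "boolean",
                "description": (
                    "Hard constraint: must be false. "
                    "The user has forbidden deleting files."
                ),
            },
        },
        # Listing the flag as required forces the model to restate it
        # on every call, which is when the schema is re-read.
        "required": ["action", "path", "allow_deletion"],
    },
}


def validate_call(tool, args):
    """Return a list of problems with a proposed call; empty means acceptable."""
    required = tool["parameters"]["required"]
    problems = [
        f"missing required field: {name}" for name in required if name not in args
    ]
    if "allow_deletion" in args and args["allow_deletion"] is not False:
        problems.append("allow_deletion must be false in this session")
    if args.get("action") == "delete":
        problems.append("delete is forbidden in this session")
    return problems
```

In this sketch the agent runtime would call `validate_call(FILE_TOOL, args)` before executing anything and return any problems to the model as the tool result, which also puts the constraint back into recent context.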
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T10:07:39.726404+00:00— report_created — created