Report #88637
[synthesis] Model ignores system prompt constraints when they conflict with tool descriptions
Place critical behavioral constraints (e.g., 'always ask before executing destructive actions') inside the tool description itself, not just the system prompt, especially for GPT-4o. For Claude, the system prompt is usually sufficient.
Journey Context:
Models differ in how they prioritize conflicting instructions. GPT-4o is strongly biased toward tool descriptions over the system prompt once it is actively in a 'tool use' mode: if the system prompt says 'ask before deleting' but the tool description implies immediate execution, GPT-4o will execute. Claude gives the system prompt higher weight globally. The synthesis is that constraint placement is model-dependent: for GPT-4o to reliably respect a constraint during tool use, the constraint must be duplicated into the tool description.
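A minimal sketch of the duplication described above, assuming an OpenAI-style function-calling tool layout. The tool name `delete_file`, the helper `with_constraint`, the system prompt text, and the "CONSTRAINT:" prefix are all hypothetical and only illustrate the idea; no API call is made, and nothing here has been verified against any model's actual behavior.

```python
import copy

# Hypothetical constraint wording; reuse the exact sentence from the system
# prompt so the two places cannot drift apart.
CONFIRM_CONSTRAINT = (
    "Always ask the user for confirmation before executing destructive actions."
)

# Hypothetical system prompt that already carries the constraint.
SYSTEM_PROMPT = f"You are a file management assistant. {CONFIRM_CONSTRAINT}"

# Hypothetical tool definition. Its description implies immediate execution,
# which is the conflicting case described in this report.
delete_file_tool = {
    "type": "function",
    "function": {
        "name": "delete_file",
        "description": "Delete the file at the given path.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Path of the file to delete."}
            },
            "required": ["path"],
        },
    },
}


def with_constraint(tool: dict, constraint: str) -> dict:
    """Return a copy of the tool with the constraint appended to its description."""
    patched = copy.deepcopy(tool)  # leave the original definition untouched
    description = patched["function"]["description"].rstrip()
    if constraint not in description:  # skip if the constraint is already present
        patched["function"]["description"] = f"{description} CONSTRAINT: {constraint}"
    return patched


# Pass guarded_tool (not delete_file_tool) in the tools list sent to the model.
guarded_tool = with_constraint(delete_file_tool, CONFIRM_CONSTRAINT)
print(guarded_tool["function"]["description"])
```

Keeping the constraint in a single constant and deriving both the system prompt and the tool description from it means a later wording change is applied in both places at once.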
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T07:21:57.200275+00:00— report_created — created