Report #28692
[synthesis] Model ignores or misuses tools because tool descriptions are not optimized for the specific model's attention and parsing patterns
For Claude, structure tool descriptions with XML tags, state the most critical information at both the beginning and the end, and include explicit examples. For GPT-4o, keep descriptions concise, front-load critical constraints, and use short bullet points. Both models benefit from examples: Claude more from structured examples wrapped in XML tags, GPT-4o more from examples written inline in the description text.
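A minimal sketch of the two description styles, rendered from one source of truth. Everything here is an illustrative assumption rather than any vendor's API: the `ToolSpec` type, the `read_file` tool, the helper names, and the tag names (`<constraint>`, `<purpose>`, `<example>`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical, model-agnostic description of one tool."""
    name: str
    purpose: str
    critical: str  # the one constraint the model must not miss
    example: str   # a sample call, written as plain text


def describe_for_claude(spec: ToolSpec) -> str:
    # Structured variant: XML-style tags, critical constraint first AND last.
    return "\n".join([
        f"<constraint>{spec.critical}</constraint>",
        f"<purpose>{spec.purpose}</purpose>",
        f"<example>{spec.example}</example>",
        f"<constraint>{spec.critical}</constraint>",
    ])


def describe_for_gpt4o(spec: ToolSpec) -> str:
    # Concise variant: constraint front-loaded, short bullets, inline example.
    return "\n".join([
        f"- MUST: {spec.critical}",
        f"- {spec.purpose}",
        f"- Example: {spec.example}",
    ])


# Hypothetical tool used only for illustration.
spec = ToolSpec(
    name="read_file",
    purpose="Read a UTF-8 text file from the workspace.",
    critical="Paths must be relative to the workspace root.",
    example='read_file(path="src/main.py")',
)
claude_description = describe_for_claude(spec)
gpt4o_description = describe_for_gpt4o(spec)
```

Keeping the constraint in a single field means the two variants cannot drift apart when the constraint changes; only the rendering differs per model.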
Journey Context:
Models differ in how they attend to and interpret tool descriptions, and the differences are consistent enough to design around. Claude shows both a primacy and a recency bias and responds strongly to structured formatting in descriptions (XML tags, numbered lists, clear delimiters). GPT-4o shows a stronger primacy bias and responds better to concise, front-loaded descriptions with minimal formatting overhead. A common mistake is to write one set of tool descriptions and use it unchanged across models. The result: Claude may miss constraints buried in the middle of a long unstructured description, while GPT-4o may lose the key points in a verbose XML-structured one. For cross-model coding agents, either maintain model-specific description variants or write a single description that satisfies both: critical information first and last, concise but structured, with examples that work for both parsing styles.
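The cross-model options described above could be combined roughly as follows: one shared description that puts the critical constraint first and last, plus optional per-model overrides. This is a sketch under stated assumptions; `shared_description`, `pick_description`, the prefix-matching rule, and the model identifiers are all hypothetical and only illustrate the idea.

```python
def shared_description(purpose: str, critical: str, example: str) -> str:
    # One description meant for both models: the critical constraint opens
    # and closes the text, the body stays short, the example is one line.
    return "\n".join([
        f"IMPORTANT: {critical}",
        purpose,
        f"Example: {example}",
        f"Reminder: {critical}",
    ])


def pick_description(model: str, variants: dict[str, str], fallback: str) -> str:
    # Return the variant whose key is a prefix of the model name
    # (e.g. "claude" matches "claude-example"); otherwise the shared text.
    for prefix, text in variants.items():
        if model.lower().startswith(prefix):
            return text
    return fallback


critical = "Paths must be relative to the workspace root."
fallback = shared_description(
    purpose="Read a UTF-8 text file from the workspace.",
    critical=critical,
    example='read_file(path="src/main.py")',
)

# Hypothetical per-model overrides; the keys are placeholder prefixes.
variants = {
    "claude": f"<constraint>{critical}</constraint>",
    "gpt-4o": f"- MUST: {critical}",
}

for_claude = pick_description("claude-example", variants, fallback)
for_other = pick_description("some-other-model", variants, fallback)
```

Falling back to the shared text means a model without a tuned variant still receives a description that follows the first-and-last rule.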
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T02:33:24.357040+00:00 — report_created — created