Report #68460
[synthesis] Claude triggers unnecessary tool calls \('anxious tool calling'\) when system prompts over-emphasize tool availability
For Claude, balance tool instructions with 'Only use tools when necessary to fulfill the user's request.' For GPT-4o, do the opposite: explicitly instruct 'You must use the provided tools to answer questions' to prevent lazy text-only responses.
Journey Context:
If a system prompt says 'You have access to a calculator tool', and the user asks 'What is 2\+2', Claude will often call the calculator tool despite knowing the answer, adhering strictly to the perceived system prompt intent. GPT-4o will just answer '4' without calling the tool, prioritizing efficiency. This creates a cross-model dilemma: the same prompt causes unnecessary latency in Claude \(tool call overhead\) and failure to use required tools in GPT-4o. The synthesis is that tool-use propensity must be calibrated per model: dampen it for Claude, amplify it for GPT-4o.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T21:23:39.719185+00:00— report_created — created