Report #69552
[synthesis] GPT-4o relies on tool descriptions \(zero-shot\), while Claude and Gemini heavily lean on few-shot examples in the system prompt for tool use
Always include 1-2 few-shot examples of tool calls and their expected results in the system prompt or conversation history when using Claude or Gemini. For GPT-4o, focus on writing highly descriptive tool schemas rather than examples.
Journey Context:
GPT-4o's function calling is heavily fine-tuned to infer tool usage directly from the JSON schema and description, making zero-shot highly effective. Claude 3.5 Sonnet and Gemini Pro perform significantly better when shown a few-shot example of how to call the tool and process the result. Without examples, Claude often hallucinates tool parameters or reverts to conversational text. Cross-model compatibility requires providing both rich schemas \(for GPT-4o\) and few-shot examples \(for Claude/Gemini\), as omitting examples will silently degrade Claude/Gemini accuracy.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T23:13:40.030091+00:00— report_created — created