Report #66127
[synthesis] Models fail to use newly defined tools correctly without examples, even when given JSON schemas
Provide at least one complete example of a tool call conversation (User -> Assistant Tool Call -> Tool Result -> Assistant Response) in the system prompt for Claude and Gemini, whereas GPT-4o can usually zero-shot from the schema alone.
Journey Context:
It's tempting to rely purely on the JSON schema for tool definitions to save context. GPT-4o is highly adept at zero-shot tool use from schemas. However, Claude 3.5 Sonnet often needs an example to lock onto the exact format and expected behavior, especially for complex nested schemas. Gemini 1.5 Pro frequently hallucinates required parameters if not shown an example. The synthesis is that while schemas work for GPT-4o, cross-model compatibility requires a 'schema + 1 example' approach, which dramatically increases reliability for Claude and Gemini without hurting GPT-4o.
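A minimal sketch of the 'schema + 1 example' approach, assuming the example is rendered as plain text inside the system prompt. The `get_weather` tool, its fields, the turn labels, and both helper functions are hypothetical and chosen only for illustration; each provider's actual tool-calling message format differs and is not shown here.

```python
import json

# Hypothetical tool definition (illustrative only, not from any real API).
WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Look up current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        },
        "required": ["city"],
    },
}

# One complete exchange:
# User -> Assistant Tool Call -> Tool Result -> Assistant Response.
EXAMPLE_TURNS = [
    ("User", "What's the weather in Oslo?"),
    ("Assistant Tool Call",
     json.dumps({"name": "get_weather", "arguments": {"city": "Oslo"}})),
    ("Tool Result",
     json.dumps({"temp": 4, "units": "celsius", "conditions": "rain"})),
    ("Assistant Response",
     "It is 4 degrees Celsius and raining in Oslo right now."),
]


def build_system_prompt(tool, example_turns):
    """Render the schema followed by one worked example conversation."""
    schema_text = json.dumps(tool, indent=2)
    example_text = "\n".join(f"{role}: {text}" for role, text in example_turns)
    return (
        "You can call the following tool.\n\n"
        f"Tool schema:\n{schema_text}\n\n"
        f"Example conversation:\n{example_text}\n"
    )


def example_matches_schema(tool, example_turns):
    """Check that the example call uses the declared tool name, only
    declared parameters, and every required parameter."""
    call = json.loads(dict(example_turns)["Assistant Tool Call"])
    args = call["arguments"]
    props = tool["parameters"]["properties"]
    required = tool["parameters"].get("required", [])
    return (
        call["name"] == tool["name"]
        and all(key in props for key in args)
        and all(key in args for key in required)
    )


prompt = build_system_prompt(WEATHER_TOOL, EXAMPLE_TURNS)
```

Checking the example against the schema before shipping the prompt is a cheap guard: an example that contradicts the schema would likely teach the wrong format to exactly the models that lean on it.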
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T17:28:26.005092+00:00— report_created — created