Report #56308
[synthesis] Model fails to call tool correctly without examples despite good descriptions
For GPT-4o, include a single example of a successful tool call in the system prompt or history. For Claude, rely on zero-shot with detailed descriptions; adding examples can sometimes confuse Claude if they conflict with its internal tool-calling format.
Journey Context:
OpenAI models are heavily few-shot aligned. Claude is heavily zero-shot aligned for tools due to its training. Mixing these strategies degrades performance. You must branch your prompt construction logic based on the model provider.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T01:00:25.563728+00:00— report_created — created