Report #84267
[synthesis] Model tool-calling behavior becomes erratic when few-shot examples conflict with the tool schema
For GPT-4o and Claude 3.5 Sonnet, rely on the schema definition and zero-shot prompts. For Gemini 1.5 Pro, add few-shot examples only when they are strictly necessary, and make sure every example matches the schema exactly so that it cannot pull the output away from the schema.
Journey Context:
GPT-4o and Claude 3.5 Sonnet prioritize the explicit JSON schema over few-shot examples in the prompt: when an example deviates slightly from the schema, they follow the schema and disregard the example. Gemini 1.5 Pro is strongly influenced by few-shot examples and will abandon the schema to match an example, which produces output that is invalid against the schema. For GPT-4o and Claude 3.5 Sonnet, few-shot examples bring little benefit and can degrade performance when they introduce noise; for Gemini 1.5 Pro, any examples that are used must be fully consistent with the schema.
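The consistency requirement described above can be checked mechanically before a prompt is assembled. The following is a minimal sketch, not a full JSON Schema validator: it covers only flat object schemas with `required`, `type`, `enum`, and `additionalProperties: false`. The weather tool schema, the example records, and the function names `schema_violations` and `consistent_examples` are all hypothetical and introduced here for illustration only.

```python
# Hypothetical tool schema, for illustration only (not taken from the report).
WEATHER_TOOL_SCHEMA = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city", "unit"],
    "additionalProperties": False,
}

# Maps JSON Schema type names to Python types (small subset only).
_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def _matches_type(value, type_name):
    expected = _TYPES.get(type_name)
    if expected is None:
        return True  # unknown or absent type: this sketch does not judge it
    if isinstance(value, bool) and type_name in ("integer", "number"):
        return False  # bool is an int subclass in Python; JSON keeps them apart
    return isinstance(value, expected)


def schema_violations(args, schema):
    """Return a list of ways `args` deviates from a flat object schema."""
    if not isinstance(args, dict):
        return ["arguments must be a JSON object"]
    problems = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required field: {name}")
    for name, value in args.items():
        if name not in props:
            if schema.get("additionalProperties") is False:
                problems.append(f"unexpected field: {name}")
            continue
        spec = props[name]
        if not _matches_type(value, spec.get("type")):
            problems.append(f"wrong type for field: {name}")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"value not allowed for field: {name}")
    return problems


def consistent_examples(examples, schema):
    """Keep only few-shot examples whose arguments satisfy the schema."""
    return [ex for ex in examples if not schema_violations(ex["arguments"], schema)]


# Hypothetical few-shot examples: the second one drifts from the schema.
EXAMPLES = [
    {"user": "Weather in Oslo?", "arguments": {"city": "Oslo", "unit": "celsius"}},
    {"user": "Weather in Paris?", "arguments": {"location": "Paris", "unit": "C"}},
]

SAFE_EXAMPLES = consistent_examples(EXAMPLES, WEATHER_TOOL_SCHEMA)
```

Running a check like this at prompt-build time means a drifting example is dropped or flagged before it reaches a model that would imitate it. In a real project a complete validator library would likely be a better choice than this hand-written subset.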
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:02:01.361109+00:00— report_created — created