Report #69508
[synthesis] Model outputs JSON that matches few-shot examples but violates the strict JSON schema provided in the tool definition
Put the JSON schema in the system prompt for Claude. For GPT-4o, make sure the few-shot examples in the user prompt match the schema exactly, because GPT-4o weights few-shot examples more heavily than abstract schema definitions.
Journey Context:
Developers often supply a strict JSON schema through the API and few-shot examples in the prompt. When the two conflict (e.g., the schema says int, the example has a string), models resolve the conflict differently. GPT-4o exhibits 'recency/few-shot bias', often mimicking the example's structure or data types over the strict schema. Claude exhibits 'authority/schema bias', adhering to the system/API schema and ignoring the conflicting example. As a result, a single set of prompts ported across models produces inconsistent agent outputs: GPT-4o drifts toward the examples while Claude rigidly follows the schema. The inconsistency disappears only when the examples and the schema are perfectly aligned.
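One way to catch this kind of conflict before it reaches any model is to check every few-shot example against the schema as a pre-flight step. The sketch below is illustrative only: the schema, the example outputs, and the helper names (`type_mismatches`, `check_examples`) are hypothetical, and the checker covers just a small subset of JSON Schema (`type`, `properties`, `required`, `items`). A full validator library would be the better choice in production.

```python
import json

# Hypothetical tool schema and few-shot outputs, invented for illustration.
SCHEMA = {
    "type": "object",
    "properties": {
        "order_id": {"type": "integer"},
        "status": {"type": "string"},
    },
    "required": ["order_id", "status"],
}

FEW_SHOT_OUTPUTS = [
    '{"order_id": 1042, "status": "shipped"}',
    '{"order_id": "1043", "status": "pending"}',  # string where the schema says integer
]

# Mapping from JSON Schema type names to Python types (subset only).
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def type_mismatches(instance, schema, path="$"):
    """Return a list of human-readable mismatches between instance and schema."""
    problems = []
    expected = schema.get("type")
    if expected in _JSON_TYPES:
        # bool is a subclass of int in Python, so reject it for non-boolean types.
        is_bool = isinstance(instance, bool)
        ok = isinstance(instance, _JSON_TYPES[expected]) and (
            expected == "boolean" or not is_bool
        )
        if not ok:
            problems.append(
                f"{path}: expected {expected}, got {type(instance).__name__}"
            )
            return problems
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                problems.append(f"{path}.{key}: required property missing")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in instance:
                problems.extend(
                    type_mismatches(instance[key], sub_schema, f"{path}.{key}")
                )
    if isinstance(instance, list) and "items" in schema:
        for i, item in enumerate(instance):
            problems.extend(type_mismatches(item, schema["items"], f"{path}[{i}]"))
    return problems


def check_examples(raw_examples, schema):
    """Map the index of each non-conforming example to its list of mismatches."""
    report = {}
    for idx, raw in enumerate(raw_examples):
        problems = type_mismatches(json.loads(raw), schema)
        if problems:
            report[idx] = problems
    return report


if __name__ == "__main__":
    for idx, problems in check_examples(FEW_SHOT_OUTPUTS, SCHEMA).items():
        print(f"example {idx}: {problems}")
```

Running a check like this in CI, or whenever the prompt or schema changes, means the same prompt set can be sent to either model without relying on how each one resolves a schema/example conflict.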
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T23:09:18.915947+00:00— report_created — created