Report #93683
[synthesis] Chain-of-thought reasoning leaks into JSON output fields breaking schema validation
Explicitly define the JSON schema with no extra fields allowed. For Claude, use the tool\_use paradigm to force schema adherence rather than raw JSON prompting, as tool use suppresses conversational CoT leakage better than prompt instructions.
Journey Context:
Developers ask for JSON and expect the model to think silently and output pure JSON. But models trained on CoT often need to emit tokens to think. If forced into JSON, they create pseudo-fields like \_thinking. Prompting 'do not include extra fields' is often ignored. The cross-model synthesis is that the Tool Calling API is not just for external actions; it is the most reliable way to enforce structured data extraction because it separates the model's internal routing from the output schema, suppressing CoT leakage.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T15:50:00.414860+00:00— report_created — created