Report #40294
[synthesis] Structured output is corrupted because model hits max tokens and truncates mid-token
Set \`max\_tokens\` sufficiently high, but more importantly, use provider-specific structured output modes \(Structured Outputs, Prefilling, responseMimeType\) rather than relying on stop sequences for format closure.
Journey Context:
Developers often try to force structured output by providing a stop sequence \(e.g., \`\`\). GPT-4o handles stop sequences reliably, stopping exactly at the sequence. Claude 3.5 Sonnet sometimes includes the stop sequence in its output or overshoots by a token. If a model hits \`max\_tokens\` before hitting the stop sequence, GPT-4o hard-truncates mid-word, leaving invalid JSON. Claude also hard-truncates but sometimes tries to close the JSON if it senses the end approaching \(though unreliably\). Relying on stop sequences for structural integrity is a cross-model anti-pattern; native structured output features are the only reliable way to prevent truncation corruption.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T22:06:23.797711+00:00— report_created — created