Report #38523
[synthesis] Model ignores custom stop sequences or generates conversational filler after the stop token
For GPT-4o, post-process the output to strip anything after the stop sequence. For Claude, use explicit prefilling (e.g., start the assistant message with '{') to constrain the format rather than relying solely on stop sequences. For Gemini, avoid custom stop sequences and use structured output (response_mime_type) instead.
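A minimal sketch of the post-processing step, assuming the raw completion is already available as a string. The helper name `truncate_at_stop` and the `###END###` marker are hypothetical and not part of any provider SDK.

```python
from typing import Iterable


def truncate_at_stop(text: str, stop_sequences: Iterable[str]) -> str:
    """Return text cut at the earliest occurrence of any stop sequence.

    The stop sequence itself and everything after it are dropped.
    """
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue  # an empty string would match at index 0
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    # Trailing whitespace left in front of the stop marker is removed too.
    return text[:cut].rstrip()


# Hypothetical completion where filler follows the stop marker.
raw = "Answer: 42\n###END###\nLet me know if you need anything else!"
print(truncate_at_stop(raw, ["###END###"]))  # prints: Answer: 42
```

Applying the same helper to every provider's output keeps the cleanup path uniform, whether or not the API already honored the stop sequence.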
Journey Context:
Stop sequences are an unreliable cross-model contract. GPT-4o's output might include the stop sequence itself or add filler after it. Claude applies stop sequences mechanically, which can lead to awkward truncation if the sequence appears in the model's internal chain-of-thought (CoT). Gemini's runtime often overrides custom stop sequences in favor of its native EOS token. The robust cross-model approach is to use API-level structured output (JSON mode) wherever possible, and to treat stop sequences as a secondary, unreliable signal whose output always needs string post-processing to clean up.
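When the expected output is JSON, the cleanup can be done by parsing rather than by searching for a stop marker. The sketch below is an illustration under assumptions: `extract_json_object` is a hypothetical helper, and the `prefill` argument stands for whatever text the assistant turn was started with (for example '{'), which the caller has to re-attach. It relies on `json.JSONDecoder.raw_decode` from the Python standard library, which parses one JSON document from the start of a string and tolerates extra data after it.

```python
import json
from typing import Any


def extract_json_object(completion: str, prefill: str = "") -> Any:
    """Parse the first JSON value in prefill + completion, ignoring trailing text.

    Raises json.JSONDecodeError if the text does not start with valid JSON.
    """
    # raw_decode does not skip leading whitespace, so strip it first.
    text = (prefill + completion).lstrip()
    obj, _end = json.JSONDecoder().raw_decode(text)
    return obj


# Hypothetical completion: the model continued after a '{' prefill,
# then appended conversational filler after the closing brace.
completion = '"status": "ok", "count": 3}\n\nHope that helps!'
result = extract_json_object(completion, prefill="{")
print(result)  # prints: {'status': 'ok', 'count': 3}
```

Because the parser stops at the end of the first complete JSON value, any filler after it is dropped without depending on whether the stop sequence was honored.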
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:08:17.575017+00:00— report_created — created