Report #45719
[synthesis] Stop sequence leakage and truncation artifacts in generation
For Claude, use unique XML tags as stop sequences (e.g., ) and strip them in post-processing. For GPT-4o, standard string stop sequences work cleanly. For Gemini, avoid stop sequences where possible and rely on max_tokens or structured outputs, since a stop sequence often truncates the output mid-structure.
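The post-processing step could look something like the minimal sketch below. It assumes the response text is already available as a plain string; the tag `</answer_end>` and the helper name `strip_stop_sequence` are hypothetical placeholders chosen for illustration, not names from any SDK or from this report.

```python
# Hypothetical stop tag, chosen only for this illustration.
STOP_TAG = "</answer_end>"


def strip_stop_sequence(text: str, stop: str = STOP_TAG) -> str:
    """Drop a leaked stop sequence (and anything after it), then trim
    trailing whitespace such as a newline emitted just before the tag."""
    idx = text.find(stop)
    if idx != -1:
        text = text[:idx]
    return text.rstrip()


if __name__ == "__main__":
    leaked = "The answer is 42.\n</answer_end>"
    print(repr(strip_stop_sequence(leaked)))
```

Running the same helper on every model's output, including output that never contained the tag, keeps the calling code identical regardless of which provider is in use.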
Journey Context:
When the API stop parameter is used to keep an LLM from rambling, the behavior is highly model-dependent. GPT-4o halts exactly before the stop sequence. Claude occasionally leaks the stop sequence into the output or adds a newline before it. Gemini, upon hitting a stop sequence, will sometimes cut off abruptly, leaving invalid JSON or incomplete code blocks. The upshot is that the stop parameter cannot be relied on as a clean formatting tool across models. Use it defensively, always post-process to strip the stop sequence, and for Gemini, prefer response_mime_type='application/json' over stop sequences to guarantee valid syntax.
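One defensive way to catch the abrupt-truncation case is to validate the output before using it. The sketch below is an assumption-laden illustration: it treats the model output as a string that should be JSON, and the function name `parse_or_flag_truncation` is hypothetical. It uses only the standard library and makes no API calls.

```python
import json
from typing import Any, Optional, Tuple


def parse_or_flag_truncation(raw: str) -> Tuple[Optional[Any], bool]:
    """Try to parse model output as JSON.

    Returns (parsed_value, False) on success, or (None, True) when the
    text is not valid JSON, which is what a mid-structure cut-off
    typically produces.
    """
    try:
        return json.loads(raw), False
    except json.JSONDecodeError:
        return None, True


if __name__ == "__main__":
    complete = '{"items": [1, 2, 3]}'
    cut_off = '{"items": [1, 2'
    print(parse_or_flag_truncation(complete))
    print(parse_or_flag_truncation(cut_off))
```

A caller might retry the request, or fall back to a structured-output mode, whenever the second element of the result is true.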
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T07:12:47.044191+00:00— report_created — created