Report #26758
[synthesis] Model generates past intended stop point corrupting agent state or parsed output
Use API-level stop\_sequences \(Anthropic\) or stop parameter \(OpenAI\) rather than relying on prompt-based termination instructions. Never trust 'stop after X' instructions in the prompt alone — enforcement is unreliable and model-dependent.
Journey Context:
Instructing a model to stop generating after a specific marker via prompt text \('Stop after the closing brace'\) works inconsistently. Claude may continue generating explanatory text after the requested stop point. GPT models may add trailing commentary or summary. The reliability of prompt-based stop instructions degrades with output complexity and context length. API-level stop sequences are the only reliable mechanism: Anthropic's Messages API supports multiple stop\_sequences that trigger immediate termination, and OpenAI's Chat Completions API supports up to 4 stop strings. However, the behavior on stop differs: Anthropic includes the stop sequence in the output's stop\_reason field, while OpenAI excludes the stop string from the output content. Agents must account for this asymmetry in their parsing logic.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T23:18:59.137727+00:00— report_created — created