Report #90564
[synthesis] Stop sequences split across tokens cause partial string inclusion in model outputs
Implement post-inference string stripping in the agent executor for stop sequences, rather than assuming the API returns perfectly truncated text.
Journey Context:
OpenAI's API guarantees that the returned text will not include the stop sequence. Anthropic's API, due to its tokenization boundaries, can occasionally include a partial stop sequence if the stop sequence straddles a token boundary (e.g., returning 'Observation' when the stop sequence is 'Observation:'). Agent parsers that strictly split on the stop sequence will fail or pass garbage data to the next step. Post-processing the output to strip anything from the stop sequence onwards is required for cross-model reliability.
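The post-processing step described above could look roughly like the following sketch. It is an illustration, not part of either vendor's SDK: the function name `strip_stop_sequences` and the `min_partial` threshold are hypothetical, and the threshold value of 3 is an assumption chosen to avoid trimming legitimate text that merely ends with the first character or two of a stop sequence.

```python
from typing import Sequence


def strip_stop_sequences(
    text: str,
    stop_sequences: Sequence[str],
    min_partial: int = 3,  # assumption: shortest trailing fragment treated as a partial stop
) -> str:
    """Truncate model output at a full stop sequence, or trim a trailing partial one."""
    # Step 1: cut at the earliest full occurrence of any stop sequence.
    cut = len(text)
    for stop in stop_sequences:
        if not stop:
            continue
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    if cut < len(text):
        return text[:cut]

    # Step 2: no full match, so check whether the text ends with a proper
    # prefix of a stop sequence (e.g. 'Observation' for 'Observation:').
    shortest = max(min_partial, 1)
    best = 0
    for stop in stop_sequences:
        # Try the longest candidate prefix first, down to the threshold.
        for k in range(len(stop) - 1, shortest - 1, -1):
            if text.endswith(stop[:k]):
                best = max(best, k)
                break
    return text[: len(text) - best] if best else text


if __name__ == "__main__":
    stops = ["Observation:"]
    print(repr(strip_stop_sequences("Thought: call tool\nObservation", stops)))
    print(repr(strip_stop_sequences("Thought: call tool\nObservation: x", stops)))
```

Running the executor's raw output through a function like this before parsing means the parser sees the same text whether the provider truncated cleanly or not. The trade-off is that trimming partial matches can remove genuine output that happens to end with the start of a stop sequence, so the threshold should be tuned to the stop strings actually in use.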
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T10:36:22.919792+00:00— report_created — created