Report #48301
[synthesis] Agent state machine breaks due to unsolicited conversational text emitted alongside tool calls
Add 'Do not output any text before or after tool calls. Only output the tool call.' to the system prompt, or have the response parser strip text nodes adjacent to tool_calls. Claude 3.5 frequently prepends explanatory text to a tool call, while GPT-4o typically emits tool calls in isolation.
Journey Context:
Agent frameworks often expect an LLM response to be either text or a tool call, never both. Claude 3.5 Sonnet frequently returns a text block ('Let me look that up.') and a tool_use block in the same response. GPT-4o usually leaves the message content null when tool_calls are present. A strict parser that does not handle the hybrid response may ignore the tool call entirely or crash, so the agent needs either defensive parsing or prompt-level suppression.
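A minimal sketch of the defensive-parsing option, assuming responses arrive as plain dicts shaped like the Anthropic Messages API (a list of content blocks with "type" set to "text" or "tool_use") and the OpenAI Chat Completions API (a message with "content" and "tool_calls"). The names ToolCall, NormalizedTurn, normalize_anthropic, normalize_openai and next_state are hypothetical and not part of either SDK. The idea is to reduce both shapes to one structure and let tool calls drive the state transition, so adjacent text is kept as a side channel instead of breaking the parser.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToolCall:
    id: str
    name: str
    arguments: Dict[str, Any]


@dataclass
class NormalizedTurn:
    # Text that accompanied the tool calls (may be empty).
    text: str = ""
    tool_calls: List[ToolCall] = field(default_factory=list)


def normalize_anthropic(content_blocks: List[Dict[str, Any]]) -> NormalizedTurn:
    """Collect text and tool_use blocks from an Anthropic-style content list."""
    turn = NormalizedTurn()
    texts = []
    for block in content_blocks:
        if block.get("type") == "text":
            texts.append(block.get("text", ""))
        elif block.get("type") == "tool_use":
            turn.tool_calls.append(
                ToolCall(block["id"], block["name"], block.get("input", {}))
            )
    turn.text = "".join(texts).strip()
    return turn


def normalize_openai(message: Dict[str, Any]) -> NormalizedTurn:
    """Collect content and tool_calls from an OpenAI-style assistant message."""
    # content is typically None when tool_calls are present.
    turn = NormalizedTurn(text=(message.get("content") or "").strip())
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        # Function arguments arrive as a JSON-encoded string.
        args = json.loads(fn.get("arguments") or "{}")
        turn.tool_calls.append(ToolCall(call["id"], fn["name"], args))
    return turn


def next_state(turn: NormalizedTurn) -> str:
    """Assumed two-state machine: tool calls win over any adjacent text."""
    return "EXECUTE_TOOLS" if turn.tool_calls else "RESPOND"
```

With this shape, a hybrid response such as a text block followed by a tool_use block transitions to EXECUTE_TOOLS, and the accompanying text can be logged or discarded by the caller. Whether to drop or surface that text is a design choice this sketch leaves open.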
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T11:33:06.191124+00:00— report_created — created