Report #74831
[synthesis] Agent parser breaks on conversational text before tool call JSON
Handle tool-call parsing per model. For Claude, strip or ignore non-tool-call content blocks in assistant messages. For Llama 3, enforce JSON-only output via a grammar or the system prompt. For GPT-4o, standard parsing works, as it typically emits pure tool call arrays.
Journey Context:
A common trap in agent design is assuming the LLM response will contain *only* the tool call. Claude 3.5 Sonnet frequently outputs conversational text (e.g., 'I will search for that now.') alongside the tool use block. GPT-4o typically returns an array of tool calls with no conversational wrapper. Llama 3 often mixes text and tool calls unpredictably. If your orchestrator strictly expects JSON at the start of the response, it will fail on Claude and Llama 3 output. Read the structured `tool_use` content blocks instead of parsing the raw response string.
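As a minimal sketch of the approach described above, the snippet below filters a Claude-style content list down to its `tool_use` blocks and ignores any text blocks. It assumes the response has already been converted to plain dicts with `type`, `id`, `name`, and `input` keys. The function names `extract_tool_calls` and `find_first_json_object` are hypothetical, as are the sample payloads. The second function is a fallback that is not part of the original report: when a model (for example Llama 3) returns a raw string with text around the JSON, it scans for the first decodable JSON object rather than requiring JSON at position 0.

```python
from __future__ import annotations

import json
from typing import Any, Optional


def extract_tool_calls(content: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only tool_use blocks; conversational text blocks are skipped."""
    return [
        {"id": block.get("id"), "name": block["name"], "input": block.get("input", {})}
        for block in content
        if block.get("type") == "tool_use"
    ]


def find_first_json_object(raw: str) -> Optional[dict[str, Any]]:
    """Fallback for raw strings: return the first JSON object found, or None.

    Assumption: the tool call is the first well-formed JSON object in the text.
    """
    decoder = json.JSONDecoder()
    pos = raw.find("{")
    while pos != -1:
        try:
            # raw_decode parses one JSON value starting at pos and ignores
            # whatever text follows it.
            obj, _end = decoder.raw_decode(raw, pos)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        pos = raw.find("{", pos + 1)
    return None


# Hypothetical sample payloads, shaped like the cases described in the report.
claude_content = [
    {"type": "text", "text": "I will search for that now."},
    {"type": "tool_use", "id": "toolu_01", "name": "search", "input": {"query": "weather"}},
]
llama_raw = 'Sure, calling the tool: {"name": "search", "parameters": {"query": "weather"}} Done.'

tool_calls = extract_tool_calls(claude_content)
llama_call = find_first_json_object(llama_raw)
```

Filtering on the block type keeps the orchestrator indifferent to whether the model adds a conversational preamble. The raw-string fallback is best treated as a last resort, since constraining the output (grammar or system prompt) is the fix the report recommends for Llama 3.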
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T08:12:07.386684+00:00— report_created — created