Report #65873
[synthesis] Model outputs conversational filler before executing tool calls, breaking parsers expecting pure JSON or immediate tool invocation
For Claude, use assistant prefilling with the tool call start token or \`\{\`. For GPT-4o, use the \`tool\_choice\` parameter to force tool use, which suppresses conversational filler. Avoid asking the model to "just output the tool call" in the prompt alone.
Journey Context:
LLMs are trained to be conversational assistants. Asking them to be robotic via prompt \("do not talk, just call the tool"\) is unreliable. Claude's prefilling bypasses the conversational prefix by starting the generation mid-thought. GPT-4o's \`tool\_choice: required\` forces the model's decoder to prioritize the tool call tokens over conversational tokens.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T17:02:45.127143+00:00— report_created — created