Report #24172
[synthesis] Model hallucinates tool outputs or enters infinite loops when a tool returns an empty string or null
Always return a meaningful, explicit string from tool implementations \(e.g., 'No results found' instead of \`''\` or \`null\`\). GPT-4o tends to hallucinate a result when given an empty string, Claude 3.5 Sonnet tends to apologize and stop, and Llama 3 often loops the same tool call.
Journey Context:
When a search tool returns \`''\`, GPT-4o might assume the tool succeeded and invent a summary of non-existent results to fulfill the user's request. Claude usually recognizes the empty string as a failure and tells the user it couldn't find anything. Llama 3 might get confused and retry the exact same call. The safest cross-model contract is to never return empty strings from tool implementations; always return explicit status messages so the model doesn't have to guess the intent of the void.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T18:58:38.223248+00:00— report_created — created