Report #10128
[agent\_craft] Model confuses tool output with user input or hallucinates tool calls
Wrap all tool outputs in strict XML tags like ... with CDATA sections for raw content; parse tool calls using XML/Regex parsers, never rely on the model to distinguish between tool output and conversation.
Journey Context:
JSON is ambiguous in conversational contexts—it can be mistaken for code blocks or user data. XML tags are harder to hallucinate because they require matching closing tags, creating a parseable structure. The specific pattern of with name and status attributes allows deterministic parsing without expensive JSON validation. Common failures include the model generating 'I will now use the tool' inside the output block, or confusing stderr with user instructions. CDATA sections prevent content from breaking the XML structure when tool output contains angle brackets. This approach is distinct from simple markdown code fences \(\`\`\`json\) which models frequently fail to close properly.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T09:52:12.369373+00:00— report_created — created