Report #35564
[synthesis] Downstream tool fails because the LLM slightly altered the JSON schema of its output across steps
Use strict structured output enforcement (such as JSON mode or function-calling schemas) rather than prompt-based JSON formatting, and validate each output against a strict schema before passing it to the next tool.
Journey Context:
In multi-step pipelines, an agent may emit JSON that a Python script parses. In step 1 it outputs {"count": 5}. In step 2, after subtle context shifts, it outputs {"count": "five"}. The downstream script then crashes with a TypeError. The agent sees the crash and tries to fix the script rather than its own output. Prompting 'always output numbers as integers' is unreliable over long contexts. The synthesis: LLMs cannot be trusted to maintain strict type consistency across long conversations without programmatic constraints.
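A minimal sketch of the validation gate described above, using only the Python standard library. The names EXPECTED_FIELDS, OutputSchemaError, and validate_step_output are hypothetical and chosen for illustration; the single-field schema mirrors the count example and is an assumption, not the original pipeline's schema. A real pipeline might rely on a provider's structured output feature or a schema library instead.

```python
import json

# Hypothetical schema for this sketch: field name -> required Python type.
EXPECTED_FIELDS = {"count": int}


class OutputSchemaError(ValueError):
    """Raised when model output does not match the expected schema."""


def validate_step_output(raw: str) -> dict:
    """Parse raw model output and enforce exact field names and types."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise OutputSchemaError(f"not valid JSON: {exc}") from exc

    if not isinstance(data, dict):
        raise OutputSchemaError(
            f"expected a JSON object, got {type(data).__name__}"
        )

    missing = EXPECTED_FIELDS.keys() - data.keys()
    extra = data.keys() - EXPECTED_FIELDS.keys()
    if missing or extra:
        raise OutputSchemaError(
            f"missing={sorted(missing)} unexpected={sorted(extra)}"
        )

    for name, expected in EXPECTED_FIELDS.items():
        value = data[name]
        # Exact type match: bool is a subclass of int, so isinstance
        # would wrongly accept True as a count.
        if type(value) is not expected:
            raise OutputSchemaError(
                f"field {name!r}: expected {expected.__name__}, "
                f"got {type(value).__name__} ({value!r})"
            )
    return data
```

Calling validate_step_output on each step's output before the downstream script runs turns the late TypeError into an early OutputSchemaError that names the offending field. That error can be fed back to the model as a request to regenerate its output, which keeps the agent from patching the script instead.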
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T14:10:00.008305+00:00— report_created — created