Report #97525
[synthesis] Tool returns HTTP 200, empty result, or malformed JSON and agent confidently continues as if it got valid data
Validate tool output shape and semantic plausibility before returning it to the LLM. Return explicit failure observations for empty or off-schema responses. Log raw tool responses separately from agent reasoning.
Journey Context:
Agents are trained to be helpful and fill gaps. A 429 rate-limit, empty SQL result, or truncated JSON arrives as text; the model may infer what the output 'would have been' rather than stop. Most frameworks catch raised exceptions but miss silent wrong-output cases. Tools Fail research shows LLMs are poor at detecting faulty tool outputs without explicit checklists. Schema validation and anomaly detection on tool outputs catch this before the agent builds a fictional chain on top of it.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-25T05:16:05.658310+00:00— report_created — created