Report #49413
[architecture] Agent assumes a tool call succeeded based on a 200 OK status, ignoring semantically invalid responses
Wrap tool executions in a validation layer that checks the semantic integrity of the response (e.g., the list is non-empty, required fields are not null) before returning the result to the agent's context.
Journey Context:
LLMs are easily fooled by 'successful' API responses that contain no useful data (e.g., an empty search result array). Given a 200 OK, the agent will often hallucinate an answer from the 'success' of the call rather than from its content. Moving validation out of the LLM's reasoning and into deterministic Python code around the tool means the LLM no longer has to reason about API error states, which it does poorly.
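A minimal sketch of such a validation layer follows. It is not taken from the report: the names `ToolResult`, `validate_search_response`, and `run_tool`, the `results` key, and the convention that a tool returns a `(status, payload)` tuple are all assumptions made for illustration. The point it shows is that a 200 OK with an empty list or null required fields is turned into an explicit failure before the agent sees it.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Sequence


@dataclass
class ToolResult:
    """What the agent sees: validated data or an explicit failure (hypothetical type)."""
    ok: bool
    data: Any = None
    error: Optional[str] = None


def validate_search_response(payload: Any, required_fields: Sequence[str]) -> Optional[str]:
    """Return None if the payload is usable, otherwise a short reason string.

    Assumes a payload shaped like {"results": [{...}, ...]}; adapt to the real API.
    """
    if not isinstance(payload, dict):
        return "response body is not a JSON object"
    results = payload.get("results")
    if not isinstance(results, list) or not results:
        return "'results' is missing or empty"
    for i, item in enumerate(results):
        if not isinstance(item, dict):
            return f"result {i} is not an object"
        for field in required_fields:
            if item.get(field) is None:
                return f"result {i} has null or missing field '{field}'"
    return None


def run_tool(call: Callable[[], tuple], required_fields: Sequence[str]) -> ToolResult:
    """Run a tool and check its response semantically before the agent sees it.

    `call` is assumed to return a (status_code, parsed_body) tuple.
    """
    status, payload = call()
    if status != 200:
        return ToolResult(ok=False, error=f"HTTP {status}")
    problem = validate_search_response(payload, required_fields)
    if problem is not None:
        # A 200 OK with unusable content is reported as a failure, not a success.
        return ToolResult(ok=False, error=f"semantically invalid response: {problem}")
    return ToolResult(ok=True, data=payload["results"])


if __name__ == "__main__":
    # Stand-in for a real search tool that "succeeds" with no data.
    empty = run_tool(lambda: (200, {"results": []}), ["title"])
    print(empty.ok, empty.error)
```

Returning a structured failure instead of raising keeps the error visible in the agent's context, so the agent can retry or report the gap rather than invent an answer. Whether an empty result should count as a failure depends on the tool; for some queries "no matches" is a legitimate answer and the validator would need to say so explicitly.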
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:25:24.073684+00:00— report_created — created