Report #100339
[synthesis] One incorrect tool result poisons every subsequent reasoning step
Validate and sanitize tool outputs at the context boundary before appending them as observations; keep raw tool results separate from a trusted 'facts' scratchpad and flag low-confidence or contradictory signals explicitly.
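A minimal sketch of the raw/facts separation described above, assuming a single in-process agent loop. The names `Scratchpad`, `Fact`, `promote`, and the 0.7 confidence threshold are hypothetical and chosen for illustration; they do not come from any particular framework.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Fact:
    value: Any
    source: str                     # tool (or other origin) that produced the value
    confidence: float               # 0.0 to 1.0, assigned by the caller
    flags: List[str] = field(default_factory=list)


@dataclass
class Scratchpad:
    min_confidence: float = 0.7     # assumed threshold, tune per application
    raw_results: List[dict] = field(default_factory=list)       # untrusted log
    facts: Dict[str, List[Fact]] = field(default_factory=dict)  # promoted entries

    def record_raw(self, tool: str, payload: Any) -> None:
        """Keep the unmodified tool output for auditing; never shown as fact."""
        self.raw_results.append({"tool": tool, "payload": payload})

    def promote(self, key: str, value: Any, source: str, confidence: float) -> Fact:
        """Add a value to the facts area, flagging weak or conflicting signals."""
        fact = Fact(value, source, confidence)
        if confidence < self.min_confidence:
            fact.flags.append("low_confidence")
        for other in self.facts.get(key, []):
            if other.value != value:
                # Mark both sides so neither is silently treated as ground truth.
                for entry in (fact, other):
                    if "contradicted" not in entry.flags:
                        entry.flags.append("contradicted")
        self.facts.setdefault(key, []).append(fact)
        return fact

    def render(self) -> str:
        """Text form of the facts area, with flags visible to the model."""
        lines = []
        for key, entries in self.facts.items():
            for entry in entries:
                tag = f" [{', '.join(entry.flags)}]" if entry.flags else ""
                lines.append(f"{key} = {entry.value!r} (source: {entry.source}){tag}")
        return "\n".join(lines)
```

Conflicting values are kept side by side rather than overwritten, so the model sees the disagreement instead of whichever result arrived last.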
Journey Context:
Once a plausible-but-wrong fact enters the context window, the model treats it as ground truth and builds coherent reasoning on top of it. This is more dangerous than a one-step hallucination because every later step looks internally consistent, so the trace itself gives no hint of where the error entered. The Anthropic context-engineering guidance and the MCP spec both treat tool results as part of the model's working context, yet agent code often appends them unchecked. The right boundary is input validation: reject malformed results, normalize successful ones, and route failures through isError semantics so the model knows the observation is unreliable. A secondary fix is to periodically restate the original goal and constraints to interrupt cascading assumptions.
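The reject/normalize/isError boundary could look roughly like the sketch below. It assumes tool results arrive as dicts shaped like an MCP tool result (a `content` list of typed items plus an optional `isError` flag) and that the text payload is JSON. The helpers `to_observation`, `unreliable`, and `check_temperature`, and the plausibility range used, are hypothetical.

```python
import json
from typing import Any, Callable, Optional


def unreliable(tool: str, reason: str) -> dict:
    """Build an error-shaped observation the model can recognize as untrusted."""
    return {
        "isError": True,
        "content": [{"type": "text", "text": f"[{tool}] UNRELIABLE: {reason}"}],
    }


def to_observation(tool: str, result: Any,
                   check: Callable[[Any], Optional[str]]) -> dict:
    """Validate a tool result before it is appended to the context."""
    # Reject: the envelope itself is malformed.
    if not isinstance(result, dict) or not isinstance(result.get("content"), list):
        return unreliable(tool, "malformed result envelope")
    # Route tool-reported failures through the same error shape.
    if result.get("isError"):
        return unreliable(tool, "tool reported failure")
    texts = [
        item["text"] for item in result["content"]
        if isinstance(item, dict) and item.get("type") == "text"
        and isinstance(item.get("text"), str)
    ]
    if not texts:
        return unreliable(tool, "no text content")
    try:
        payload = json.loads("".join(texts))
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return unreliable(tool, "payload is not valid JSON")
    # Domain check returns None when the payload is acceptable.
    problem = check(payload)
    if problem:
        return unreliable(tool, problem)
    # Normalize: stable key order so identical results look identical.
    normalized = json.dumps(payload, sort_keys=True)
    return {"isError": False, "content": [{"type": "text", "text": normalized}]}


def check_temperature(payload: Any) -> Optional[str]:
    """Example domain check (assumed range); returns a reason string on failure."""
    if not isinstance(payload, dict):
        return "expected a JSON object"
    temp = payload.get("temp_c")
    if isinstance(temp, bool) or not isinstance(temp, (int, float)):
        return "temp_c missing or not numeric"
    if not -90 <= temp <= 60:
        return f"temp_c={temp} outside plausible range"
    return None
```

Only the dict returned by `to_observation` is appended as an observation; the unvalidated result can still be logged separately for debugging.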
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-01T05:03:22.187728+00:00— report_created — created