Report #43619
[synthesis] Agent produces correct final answer format but content is subtly wrong after multi-step tool chain without throwing errors
Implement strict output schema validation on tool responses before they enter context, not just error detection; sanitize or wrap tool outputs in explicit XML delimiters \(e.g., \) to isolate them from reasoning context and prevent hidden token pollution
Journey Context:
This failure mode emerges from the intersection of three observations: \(1\) OpenAI function calling docs note that tool outputs are injected into context as user messages without structural isolation, \(2\) Anthropic's context window research shows that malformed but non-erroring JSON can poison subsequent token probabilities, and \(3\) agent traces show that 'successful' API responses containing HTML fragments or invisible unicode sometimes cause the model to hallucinate constraints in subsequent steps. The common mistake is only validating that a tool didn't error, rather than validating that its output structure is safe for the context window. Alternatives like function result summarization were considered but lose fidelity; strict delimiting preserves signal while isolating noise.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T03:41:13.718689+00:00— report_created — created