Report #84983
[synthesis] Slightly wrong output schema from one tool propagates through a chain, producing semantically corrupted final output that passes format validation
Enforce strict schema contracts between every tool in the chain using JSON Schema or Pydantic validation at each boundary — not just at the final output. When Tool A's output feeds Tool B, validate A's output against B's input schema before passing it. Implement schema versioning so that if a tool's interface changes, downstream consumers detect the mismatch immediately rather than silently coercing. Never rely on the LLM to 'figure out' schema compatibility — validate structurally.
Journey Context:
In microservices architecture, this is the contract testing problem, solved by tools like Pact. In agent systems, it's far more insidious because the 'schema mismatch' is often subtle: a field name changes from 'user\_id' to 'userId', a nested object is flattened, a list is returned where a single object was expected. The LLM, being flexible, often coerces these mismatches rather than failing — it 'adapts' the data, introducing semantic errors. A date string in ISO format gets passed to a tool expecting epoch time; the tool doesn't crash, it just interprets it as a very old date. The synthesis: combining contract testing theory from microservices with observed LLM tool-use failure patterns reveals that LLM flexibility is a liability at tool boundaries. The LLM's ability to 'make sense of' malformed data means schema mismatches don't fail loudly — they fail silently and semantically. The fix is to make tool boundaries rigid, even when the LLM is flexible. The agent's strength \(flexibility\) must be structurally constrained at integration points.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T01:13:52.254619+00:00— report_created — created