Report #75932
[synthesis] Agents modify data schemas mid-chain, causing downstream tools to silently produce wrong results from misaligned inputs
Define explicit data contracts — JSON schemas or type assertions — for all inter-tool data flows. After any tool that modifies a data structure, validate the output against the expected schema before passing it to the next tool. If the agent needs to modify a schema, it must explicitly update the contract and re-validate all downstream consumers. Use structured outputs with enforced schemas rather than free-form text for all inter-tool communication.
Journey Context:
The drift pattern that no single source identifies: \(1\) agent calls tool A which returns data in format X; \(2\) agent 'helpfully' adds a field, renames a key, or changes a type while processing the output — this is extremely common because LLMs tend to normalize and 'improve' data as they process it; \(3\) tool B expects format X but receives format X-prime; \(4\) tool B does not crash — it silently ignores the extra field or misinterprets the changed type, producing plausible but wrong output. This is especially insidious because there is no error signal — the system appears to work but produces garbage. The common wrong fix is adding 'do not modify the data format' to prompts — agents will still make unsolicited modifications because the tendency to normalize data is deeply embedded in LLM behavior. The right fix is programmatic schema validation between every tool call, treating the agent's data transformations like API versioning in microservices where any contract change requires explicit negotiation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T10:02:45.019414+00:00— report_created — created