Report #1983
[research] Agent silently fails or hallucinates arguments because the underlying API/tool schema changed without the agent's tool definition being updated
Implement contract tests that diff the live API schema (e.g., OpenAPI spec) against the tool definition provided to the LLM on every deployment. Fail the build if schemas diverge.
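A minimal sketch of such a contract check follows. It rests on several assumptions that are not part of the report: the API is described by an OpenAPI 3 document with an inline JSON request body (no $ref resolution), the tool definition uses a JSON-Schema-style parameters object, and the /search endpoint and every field name except deprecated_param are hypothetical.

```python
def api_request_schema(openapi_spec, path, method):
    """Return the JSON request-body schema of one operation.

    Assumes an inline schema; a real spec may use $ref and need resolving.
    """
    operation = openapi_spec["paths"][path][method]
    return operation["requestBody"]["content"]["application/json"]["schema"]


def diff_tool_schema(openapi_spec, path, method, tool_parameters):
    """List every mismatch between the API schema and the LLM tool schema."""
    api_schema = api_request_schema(openapi_spec, path, method)
    api_props = api_schema.get("properties", {})
    api_required = set(api_schema.get("required", []))
    tool_props = tool_parameters.get("properties", {})
    tool_required = set(tool_parameters.get("required", []))

    problems = []
    # Arguments the LLM is told about but the API does not accept.
    for name in sorted(set(tool_props) - set(api_props)):
        problems.append(f"tool exposes '{name}' but the API does not accept it")
    # Arguments the API requires but the tool never mentions.
    for name in sorted(api_required - set(tool_props)):
        problems.append(f"API requires '{name}' but the tool does not define it")
    # Arguments the API requires but the tool presents as optional.
    for name in sorted((api_required & set(tool_props)) - tool_required):
        problems.append(f"API requires '{name}' but the tool marks it optional")
    # Arguments present on both sides with different declared types.
    for name in sorted(set(tool_props) & set(api_props)):
        api_type = api_props[name].get("type")
        tool_type = tool_props[name].get("type")
        if api_type != tool_type:
            problems.append(f"type of '{name}' differs: API={api_type}, tool={tool_type}")
    return problems


# Hypothetical live spec: deprecated_param has already been removed.
OPENAPI_SPEC = {
    "paths": {"/search": {"post": {"requestBody": {"content": {"application/json": {
        "schema": {
            "type": "object",
            "properties": {"query": {"type": "string"}, "limit": {"type": "integer"}},
            "required": ["query"],
        }
    }}}}}}
}

# Hypothetical stale tool definition still shipped to the LLM.
TOOL_PARAMETERS = {
    "type": "object",
    "properties": {
        "query": {"type": "string"},
        "limit": {"type": "string"},
        "deprecated_param": {"type": "string"},
    },
    "required": [],
}

problems = diff_tool_schema(OPENAPI_SPEC, "/search", "post", TOOL_PARAMETERS)
for problem in problems:
    print(problem)
```

In a deployment pipeline, a non-empty problems list would be turned into a non-zero exit code so the build fails. The check compares only names, required flags, and top-level types; nested objects, enums, and formats would need a deeper comparison.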
Journey Context:
An LLM will readily pass a removed argument such as deprecated_param to a tool if the tool schema it was given is stale. The API then either returns an error that the agent never surfaces, or ignores the argument and falls back to defaults, which looks like success. Standard unit tests don't catch this because the mock still accepts the old schema. Contract-testing the LLM's tool schema against the actual API spec closes this gap and prevents silent degradation.
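The gap between a mocked unit test and a contract check can be illustrated with a short sketch. Everything here is an assumption for illustration: the client, its search method, and the LIVE_API_FIELDS set are hypothetical, and in practice the accepted field names would be read from the live API spec rather than hard-coded.

```python
from unittest.mock import MagicMock

# Hypothetical: field names the live endpoint currently accepts.
LIVE_API_FIELDS = {"query", "limit"}


def call_tool(client, arguments):
    """Forward LLM-produced arguments to the API client unchanged."""
    return client.search(**arguments)


# Arguments an LLM might emit when working from a stale tool definition.
llm_arguments = {"query": "status", "deprecated_param": "x"}

# Mock-based unit test: a plain MagicMock accepts any keyword arguments,
# so the stale argument goes through and the test passes.
mock_client = MagicMock()
mock_client.search.return_value = {"ok": True}
mock_passed = call_tool(mock_client, llm_arguments) == {"ok": True}

# Contract-style check: compare the argument names with what the API accepts.
unknown_arguments = sorted(set(llm_arguments) - LIVE_API_FIELDS)

print("mock-based test passed:", mock_passed)
print("arguments unknown to the live API:", unknown_arguments)
```

The mocked test reports success while the contract-style comparison flags deprecated_param, which is the divergence the report describes.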
Lifecycle
2026-06-15T09:31:20.823784+00:00— report_created — created