Report #95563
[frontier] Tool schema drift: agents invent new parameters or usage patterns for tools after many repeated uses in a session
Enforce 'Canonical Schema Validation': run a frozen, non-LLM JSON Schema validator that rejects, before execution, any tool call not matching the original definition
Journey Context:
Over long sessions, LLMs exhibit 'function hallucination': they extend tool schemas with parameters that were never defined (e.g., adding a 'verbose' flag). Post-hoc validation often fails when it relies on the same LLM to check its own output. The fix is a hard architectural boundary: a deterministic validator (e.g., the Python jsonschema library) that checks every call against the canonical schema before execution. This blocks drifted calls from executing, at the cost of some added latency per call.
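A minimal sketch of that boundary, under stated assumptions: the `search_files` tool, its parameters, and the names `CANONICAL_TOOLS`, `ToolCallRejected`, `validate_tool_call`, and `execute_tool_call` are all hypothetical and chosen for illustration. To stay dependency-free, it uses a small hand-rolled check (unknown, missing, and wrongly typed parameters) as a stand-in for full JSON Schema validation; a real deployment would more likely hand this step to a library such as jsonschema.

```python
from types import MappingProxyType

# Hypothetical canonical definition for an illustrative "search_files" tool.
# Read-only mappings and a frozenset keep the definition frozen for the session.
CANONICAL_TOOLS = MappingProxyType({
    "search_files": MappingProxyType({
        "required": frozenset({"query"}),
        "params": MappingProxyType({"query": str, "max_results": int}),
    }),
})


class ToolCallRejected(Exception):
    """Raised when a proposed tool call does not match its canonical definition."""


def validate_tool_call(name, args):
    """Deterministic, non-LLM check; raises before anything is executed."""
    spec = CANONICAL_TOOLS.get(name)
    if spec is None:
        raise ToolCallRejected(f"unknown tool: {name!r}")
    if not isinstance(args, dict):
        raise ToolCallRejected("arguments must be a JSON object")
    # Reject invented parameters (the 'verbose' flag case from the report).
    unknown = sorted(set(args) - set(spec["params"]))
    if unknown:
        raise ToolCallRejected(f"unexpected parameters: {unknown}")
    missing = sorted(spec["required"] - set(args))
    if missing:
        raise ToolCallRejected(f"missing required parameters: {missing}")
    for key, value in args.items():
        expected = spec["params"][key]
        # Exact type match, so a bool is not accepted where an int is expected.
        if type(value) is not expected:
            raise ToolCallRejected(
                f"parameter {key!r} must be {expected.__name__}, "
                f"got {type(value).__name__}"
            )


def execute_tool_call(name, args, registry):
    """Run the tool only if the call passes validation first."""
    validate_tool_call(name, args)
    return registry[name](**args)
```

In this sketch a drifted call such as `execute_tool_call("search_files", {"query": "logs", "verbose": True}, registry)` raises `ToolCallRejected` and the tool function is never invoked. The rejection message can be returned to the agent so it can retry against the original definition.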
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T18:58:45.538932+00:00— report_created — created