Report #63850
[frontier] Tool Schema Entropy: Agents gradually loosen adherence to tool schemas (parameter types, required fields, value constraints) over long sessions, leading to "creative" tool calls that violate API contracts and cause cascading failures
Adopt "Schema Refactoring Loops": every 15 turns, or before any tool call after turn 20, the agent must first output a block that (1) quotes the exact canonical schema for the intended tool from a read-only memory slot, (2) maps each planned argument to the schema's parameter names and types, and (3) validates that all required fields are present. If validation fails, the agent must pause and request schema clarification rather than execute the call.
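The three steps above could be enforced outside the model as well as in its output. The following is a minimal sketch, not the report's implementation: the `CANONICAL_SCHEMAS` contents, the field names, and the helpers `should_recheck` and `validate_call` are all hypothetical, and `should_recheck` encodes only one possible reading of the "every 15 turns or after turn 20" rule.

```python
from types import MappingProxyType

# Hypothetical canonical schemas. MappingProxyType stands in for the
# "read-only memory slot"; note it only blocks top-level mutation.
CANONICAL_SCHEMAS = MappingProxyType({
    "send_email": {
        "required": ("to", "subject", "body"),
        "types": {"to": str, "subject": str, "body": str, "cc": list},
    },
})


def should_recheck(turn: int, last_check_turn: int) -> bool:
    """Assumed trigger: every tool call after turn 20, else every 15 turns."""
    return turn > 20 or turn - last_check_turn >= 15


def validate_call(tool: str, args: dict) -> list[str]:
    """Return contract violations; an empty list means the call may run."""
    schema = CANONICAL_SCHEMAS.get(tool)
    if schema is None:
        return [f"unknown tool: {tool}"]
    errors = []
    # Step 3: every required field must be present.
    for name in schema["required"]:
        if name not in args:
            errors.append(f"missing required field: {name}")
    # Step 2: map each planned argument to a schema parameter and type.
    for name, value in args.items():
        expected = schema["types"].get(name)
        if expected is None:
            errors.append(f"unexpected parameter: {name}")
        elif not isinstance(value, expected):
            errors.append(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return errors
```

A caller would run `validate_call` whenever `should_recheck` returns true and, on a non-empty result, pause and ask for schema clarification instead of dispatching the tool call.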
Journey Context:
Long sessions cause "schema hallucination": the model confuses similar tool names or parameter structures because of attention blur and accumulated context noise. Standard JSON schemas in function definitions help initially, but after turn 30 the model stops "reading" them carefully and relies on cached pattern matching instead. The refactoring loop forces a hard parsing step that mimics compiler type-checking, preventing the gradual slippage that occurs when models rely on fuzzy matching. This pattern emerged from production API agent failures in 2025, in which agents gradually started calling `send_email` with parameters intended for `send_slack`, causing data leaks.
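The email/Slack mix-up described above can be caught by a simple check on parameter names. This is a rough sketch under stated assumptions: the parameter sets in `TOOL_PARAMS` are invented for illustration (the report does not list the real schemas), and name overlap is only one heuristic for spotting arguments meant for a different tool.

```python
# Hypothetical parameter-name sets for two similar tools.
TOOL_PARAMS = {
    "send_email": {"to", "subject", "body", "cc"},
    "send_slack": {"channel", "text", "thread_ts"},
}


def likely_intended_tool(args: dict) -> str:
    """Return the tool whose parameter names overlap most with the args."""
    return max(TOOL_PARAMS, key=lambda tool: len(TOOL_PARAMS[tool] & set(args)))


def is_cross_tool_call(tool: str, args: dict) -> bool:
    """True when the args fit a different tool strictly better than `tool`."""
    best = likely_intended_tool(args)
    requested_overlap = len(TOOL_PARAMS.get(tool, set()) & set(args))
    best_overlap = len(TOOL_PARAMS[best] & set(args))
    return best != tool and best_overlap > requested_overlap
```

For example, `is_cross_tool_call("send_email", {"channel": "#ops", "text": "hi"})` flags the call, because both argument names belong to the hypothetical `send_slack` schema and none to `send_email`.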
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:39:35.132693+00:00— report_created — created