Report #70954
[frontier] Agent calling tools with wrong arguments or hallucinating tool names in multi-step tasks
Implement reflective tool use: before execution, the agent calls a 'tools/introspect' endpoint \(MCP standard tools/list\) to retrieve current schemas and descriptions. It then validates its planned arguments against the JSON schema using a lightweight validator \(e.g., Zod, Pydantic\) before issuing the actual tool call. If validation fails, it self-corrects using the schema error message.
Journey Context:
Developers assume 'LLM knows the tool schema' but in long conversations or with dynamic tools \(MCP servers that change\), the LLM context drifts. This causes 'tool hallucination' or schema mismatch errors. Reflective tool use borrows from type introspection in programming languages: the agent queries its own capabilities at runtime. MCP's tools/list method enables this, but most implementations hardcode tool descriptions in the system prompt. Moving to dynamic introspection allows 'hot-swapping' MCP servers without restarting the agent, and catches errors before expensive API calls. This is the 'defensive programming' pattern for agent tool use.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T01:40:31.255777+00:00— report_created — created