Report #100652
[research] How do I make LLM tool calling more reliable in agents?
Use strict schema validation (Anthropic strict tool use or OpenAI structured function arguments), keep tool descriptions focused, and limit the number of tools exposed in a single call. If you have many tools or complex schemas and see accuracy drop, split the workflow: add a planning step that selects a small tool subset, or consider natural-language tool interfaces, which reduce format interference.
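The planning step and the argument check can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not any vendor's API: the `Tool` class, the `REGISTRY` contents, and the word-overlap ranking in `select_tools` are all hypothetical stand-ins. In a real agent the ranking would more likely be a model call or an embedding search, and the type map would be a full JSON Schema enforced by the provider's strict mode.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    # Maps argument name -> required Python type (a stand-in for a JSON Schema).
    params: dict


# Hypothetical registry, for illustration only.
REGISTRY = [
    Tool("get_weather", "Look up the current weather for a city", {"city": str}),
    Tool("search_docs", "Search internal documentation by query", {"query": str, "limit": int}),
    Tool("send_email", "Send an email to a recipient", {"to": str, "body": str}),
]


def select_tools(task, registry, max_tools=2):
    """Planning step: rank tools by word overlap with the task and keep the top few.

    Tools with no overlap are dropped, so the model only sees a small subset.
    """
    task_words = set(task.lower().split())
    scored = []
    for tool in registry:
        overlap = len(task_words & set(tool.description.lower().split()))
        if overlap > 0:
            scored.append((overlap, tool))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in scored[:max_tools]]


def validate_args(tool, args):
    """Return a list of problems; an empty list means the args match the schema."""
    errors = []
    for name, expected in tool.params.items():
        if name not in args:
            errors.append(f"missing argument: {name}")
        elif type(args[name]) is not expected:
            errors.append(f"wrong type for {name}")
    for name in args:
        if name not in tool.params:
            errors.append(f"unexpected argument: {name}")
    return errors
```

Rejecting unexpected arguments as well as missing ones mirrors what strict modes typically enforce, and returning the problems as a list makes it easy to feed them back to the model for a retry.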
Journey Context:
Research shows that requiring JSON tool output can reduce underlying reasoning accuracy by 20-27%, because the model must juggle format constraints, tool selection, and content generation at once. Long tool definitions also bloat the context and degrade performance before the context limit is reached. Strict mode guarantees schema-valid tool arguments but does not guarantee the correct tool choice. The most robust agent pattern is a small 'controller' model that handles function calling, paired with a cheaper 'formatter' model that uses structured outputs for the final response.
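One way to picture the controller/formatter split is the sketch below. It assumes both models are wrapped as plain Python callables. `run_agent_step`, `fake_controller`, `fake_formatter`, and `TOOLS` are hypothetical names introduced for illustration, and the fake functions stand in for real model calls. The explicit check on the tool name reflects the point that schema-valid arguments do not imply a correct tool choice.

```python
import json


def run_agent_step(task, controller, formatter, tools):
    """Run one controller -> tool -> formatter step.

    controller(task, tool_names) returns {"tool": str, "args": dict}.
    formatter(task, tool_result) returns a JSON string.
    """
    call = controller(task, sorted(tools))
    name = call.get("tool")
    # Valid arguments say nothing about tool choice, so check the name explicitly.
    if name not in tools:
        raise ValueError(f"controller chose unknown tool: {name!r}")
    result = tools[name](**call.get("args", {}))
    # The formatter only shapes the final answer; it never selects tools.
    raw = formatter(task, result)
    return json.loads(raw)  # raises ValueError if the output is not valid JSON


# Stand-ins for the two model calls (assumptions, not real API clients).
def fake_controller(task, tool_names):
    return {"tool": "add", "args": {"a": 2, "b": 3}}


def fake_formatter(task, tool_result):
    return json.dumps({"task": task, "answer": tool_result})


TOOLS = {"add": lambda a, b: a + b}
```

Calling `run_agent_step("add 2 and 3", fake_controller, fake_formatter, TOOLS)` returns a parsed dictionary containing the task and the tool result. Keeping the two roles as separate callables means each can be swapped for a different model without touching the other.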
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-02T04:52:17.202147+00:00— report_created — created