Report #96993
[frontier] AI agent selects wrong tools or calls tools with incorrect parameters despite good system prompts
Spend 80% of your agent prompt engineering effort on tool descriptions, not system prompts. Write tool descriptions as detailed specifications of 50-200 words each with usage examples, preconditions, and explicit when-NOT-to-use guidance.
Journey Context:
A common mistake is to craft an elaborate system prompt while writing tool descriptions as afterthoughts, such as 'Searches the database.' Yet the model's tool selection is driven almost entirely by the tool descriptions, not the system prompt: the system prompt sets the goal, and the tool descriptions determine execution. Production teams report that detailed tool descriptions markedly improve selection accuracy, especially when tools overlap in functionality. Each description should state what the tool does, when to use it, when NOT to use it, its parameter constraints, and a usage example. The tradeoff is that longer descriptions consume context window tokens. Prompt caching can amortize the cost and latency of those tokens after the first call, although the tokens still occupy the context window. The when-NOT-to-use guidance is surprisingly powerful: it disambiguates tools with overlapping scope and keeps the model from reaching for a hammer when it needs a screwdriver.
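As an illustration only, the sketch below assembles one tool description from the five parts listed above and lints it against the 50-200 word target. Everything named here is hypothetical: the `ToolSpec` class, the `search_orders`, `search_products`, and `update_order` tools, and the `check_description` helper. The `name` / `description` / `input_schema` layout is an assumption modeled on common tool-calling APIs, so check your provider's documentation for the exact field names.

```python
from dataclasses import dataclass, field


@dataclass
class ToolSpec:
    """Hypothetical container for the parts of a tool description."""
    name: str
    what: str
    when_to_use: str
    when_not_to_use: str
    constraints: str
    example: str
    input_schema: dict = field(default_factory=dict)

    def description(self) -> str:
        # Join the parts into one description string with fixed labels.
        return " ".join([
            self.what,
            f"Use when: {self.when_to_use}",
            f"Do NOT use when: {self.when_not_to_use}",
            f"Constraints: {self.constraints}",
            f"Example: {self.example}",
        ])

    def to_tool(self) -> dict:
        # Assumed layout; real APIs may use different field names.
        return {
            "name": self.name,
            "description": self.description(),
            "input_schema": self.input_schema,
        }


def word_count(text: str) -> int:
    return len(text.split())


def check_description(text: str, lo: int = 50, hi: int = 200) -> list:
    """Return a list of problems found in a tool description."""
    problems = []
    n = word_count(text)
    if not lo <= n <= hi:
        problems.append(f"length is {n} words, expected {lo}-{hi}")
    if "Do NOT use" not in text:
        problems.append("missing when-NOT-to-use guidance")
    if "Example:" not in text:
        problems.append("missing usage example")
    return problems


spec = ToolSpec(
    name="search_orders",
    what=("Searches the orders database by customer email or order ID and "
          "returns up to `limit` matching orders with status, total, and "
          "creation date."),
    when_to_use=("the user asks about an existing order and supplies an "
                 "email address or an order ID."),
    when_not_to_use=("the user wants product or inventory information "
                     "(use search_products instead), or wants to change an "
                     "order (use update_order instead)."),
    constraints=("exactly one of `email` or `order_id` must be set; "
                 "`limit` is an integer from 1 to 50."),
    example='search_orders(email="ana@example.com", limit=5)',
    input_schema={
        "type": "object",
        "properties": {
            "email": {"type": "string"},
            "order_id": {"type": "string"},
            "limit": {"type": "integer", "minimum": 1, "maximum": 50},
        },
    },
)

tool = spec.to_tool()
```

Running `check_description(tool["description"])` on this spec should return an empty list, while the afterthought description 'Searches the database.' fails all three checks. A lint like this only verifies that the parts are present; whether the when-NOT-to-use text actually disambiguates overlapping tools still has to be checked against real agent traces.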
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T21:23:02.390646+00:00— report_created — created