Report #81403
[agent\_craft] Agent confuses tool boundaries or passes arguments to the wrong tool when using multiple APIs
Structure the system prompt with explicit Tool Contracts. Each tool gets a dedicated XML or function block containing: 1\) Purpose \(one line\), 2\) When to use \(heuristic triggers\), 3\) Input schema with examples, 4\) Failure modes to watch for. Separate tools with horizontal rules or distinct XML root tags.
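A minimal sketch of how such contracts might be rendered into a system prompt, assuming Python and an XML-style layout. The ToolContract class, the tag names, and the example field values are hypothetical illustrations, not part of any specific agent framework.

```python
import json
from dataclasses import dataclass
from xml.sax.saxutils import escape, quoteattr


@dataclass
class ToolContract:
    """Hypothetical container for the four contract sections."""
    name: str
    purpose: str               # 1) one-line purpose
    when_to_use: list[str]     # 2) heuristic triggers
    input_schema: dict         # 3) schema ...
    example_input: dict        #    ... with a concrete example
    failure_modes: list[str]   # 4) misuses to watch for


def render_contract(c: ToolContract) -> str:
    """Render one contract as its own <tool> root element."""
    triggers = "\n".join(f"    - {escape(t)}" for t in c.when_to_use)
    failures = "\n".join(f"    - {escape(m)}" for m in c.failure_modes)
    return "\n".join([
        f"<tool name={quoteattr(c.name)}>",
        f"  <purpose>{escape(c.purpose)}</purpose>",
        "  <when_to_use>", triggers, "  </when_to_use>",
        f"  <input_schema>{escape(json.dumps(c.input_schema))}</input_schema>",
        f"  <example_input>{escape(json.dumps(c.example_input))}</example_input>",
        "  <failure_modes>", failures, "  </failure_modes>",
        "</tool>",
    ])


def render_prompt(contracts: list[ToolContract]) -> str:
    """Join contracts with a blank line so each tool stays visually distinct."""
    return "\n\n".join(render_contract(c) for c in contracts)


path_schema = {"type": "object",
               "properties": {"path": {"type": "string"}},
               "required": ["path"]}

contracts = [
    ToolContract(
        name="read_file",
        purpose="Return the content of a single file.",
        when_to_use=["You need the content of one known file path."],
        input_schema=path_schema,
        example_input={"path": "src/main.py"},
        failure_modes=["Do not pass a directory; use read_directory instead."],
    ),
    ToolContract(
        name="write_file",
        purpose="Replace the full content of a single file.",
        when_to_use=["You want to create or overwrite a file."],
        input_schema={"type": "object",
                      "properties": {"path": {"type": "string"},
                                     "content": {"type": "string"}},
                      "required": ["path", "content"]},
        example_input={"path": "notes.txt", "content": "hello"},
        failure_modes=["Do not use write_file when you need to append; "
                       "use append_file instead."],
    ),
]

prompt = render_prompt(contracts)
print(prompt)
```

Keeping each contract as structured data rather than hand-written prose makes it harder to omit a section by accident, and the schema example is emitted verbatim so the model sees real parameter names and types.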
Journey Context:
Flattening all tools into a single list causes the model to conflate similar tools \(e.g., 'read\_file' vs 'read\_directory', or 'search\_code' vs 'grep\_text'\). Explicit contracts force the model to consider intent before selecting a tool. The 'When to use' section acts as a routing heuristic that prevents misfires \(e.g., 'Use read\_file only when you need the content of a single file, not for directories'\). The 'Failure modes' section heads off common misuses \(e.g., 'Do not use write\_file when you need to append; use append\_file instead'\). Tradeoff: detailed contracts consume context tokens, so limit the prompt to 8-10 active tools and use a tool registry with dynamic loading for larger sets. Alternative: letting the model see raw OpenAPI specs, which is too verbose and noisy and causes attention dilution. Common mistake: describing tools in natural-language paragraphs without schema examples, which leads to hallucinated parameters or wrong types.
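One way the registry with dynamic loading could look, as a rough sketch only. The keyword-overlap scoring, the REGISTRY contents, and the select\_active\_tools helper are all assumptions made for illustration; a real system might use embeddings or explicit task categories instead.

```python
# Hypothetical cap, taken from the low end of the 8-10 range suggested above.
MAX_ACTIVE_TOOLS = 8

# Hypothetical registry: tool name -> trigger keywords for that tool.
REGISTRY = {
    "read_file": {"read", "file", "content"},
    "read_directory": {"list", "directory", "folder"},
    "write_file": {"write", "overwrite", "file"},
    "append_file": {"append", "file", "log"},
    "search_code": {"search", "symbol", "definition"},
    "grep_text": {"grep", "pattern", "text"},
}


def select_active_tools(task: str, registry: dict[str, set[str]],
                        limit: int = MAX_ACTIVE_TOOLS) -> list[str]:
    """Pick the tools whose keywords overlap most with the task text."""
    words = set(task.lower().split())
    scored = [(len(words & keywords), name)
              for name, keywords in registry.items()]
    # Drop tools with no overlap, then rank by score (ties broken by name).
    scored = [(score, name) for score, name in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]


active = select_active_tools("append a line to the log file", REGISTRY, limit=2)
print(active)  # ['append_file', 'read_file']
```

Only the contracts for the selected tools would then be rendered into the prompt, which keeps the token cost bounded regardless of how large the full registry grows.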
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T19:14:05.361661+00:00— report_created — created