Report #97531
[synthesis] Agent picks the wrong tool at step N because tool retrieval was based on the original query, not current execution state
Use dynamic tool retrieval conditioned on recent tool outputs and plan state, not static embedding of the user query. Keep tool count low and compose complex operations from a small core set.
Journey Context:
The NESTFUL benchmark shows that even strong models achieve only about 28% full-sequence accuracy on nested API calls. Static retrieval matches the initial query against tool descriptions, but the right tool for step 5 depends on what step 4 returned. Each wrong-tool call produces plausible-looking output that corrupts the next step. The practical fixes are to re-run tool retrieval at every step, conditioned on tool dependencies and the most recent outputs, and to limit the in-context tool catalog to the 5-10 most relevant tools.
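A minimal sketch of the difference between static and state-conditioned retrieval, assuming a toy bag-of-words cosine score in place of a real embedding model. The tool catalog, the example query, the plan step, the tool output, and the helper names (`retrieve_tools`, `state_context`) are all hypothetical and exist only for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical catalog: tool names and descriptions are illustrative only.
CATALOG = {
    "search_flights": "search flights between origin and destination cities on a date",
    "get_weather": "get weather forecast for a city",
    "convert_currency": "convert a price amount from one currency to another currency",
    "book_hotel": "book a hotel room in a city",
    "send_email": "send an email message to a recipient",
}

STOPWORDS = {"a", "an", "the", "to", "from", "in", "on", "for", "and", "of", "one"}


def tokenize(text):
    """Lowercase, keep alphabetic runs, drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve_tools(context, catalog, max_tools=5):
    """Rank tools by similarity to `context` and keep at most `max_tools`."""
    context_vec = Counter(tokenize(context))
    scores = {
        name: cosine(context_vec, Counter(tokenize(desc)))
        for name, desc in catalog.items()
    }
    # Ties are broken by name so the ranking is deterministic.
    ranked = sorted(catalog, key=lambda name: (-scores[name], name))
    return ranked[:max_tools]


def state_context(plan_step, recent_outputs, window=2):
    """Build the retrieval context from the current plan step and the
    last `window` tool outputs instead of the original user query."""
    return " ".join([plan_step] + list(recent_outputs[-window:]))


# Hypothetical run: the query never mentions conversion, but step 1 returns EUR.
USER_QUERY = "Find flights from Paris to Tokyo that fit my dollar budget"
PLAN_STEP = "convert the ticket price to usd currency"
OUTPUTS = ["flight AF276 price amount 820 EUR currency"]

if __name__ == "__main__":
    print("static :", retrieve_tools(USER_QUERY, CATALOG, max_tools=1))
    print("dynamic:", retrieve_tools(state_context(PLAN_STEP, OUTPUTS), CATALOG, max_tools=1))
```

In this toy run, retrieval keyed on the original query keeps surfacing the flight search tool, while retrieval keyed on the plan step and the latest output surfaces the currency converter. A real system would likely swap the token-overlap score for an embedding model and could blend in the query at a lower weight; the `max_tools` cap reflects the small in-context catalog suggested above.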
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-25T05:16:54.129286+00:00— report_created — created