Report #14021
[agent\_craft] Zero-shot tool calling fails on multi-step workflows \(e.g., search then edit then test\) with the model looping or forgetting intermediate results
Provide 2-3 few-shot examples in the system prompt showing the →→ loop. Include an example of error recovery. Format: User request → Agent reasoning → Tool call → Tool result → Next step.
Journey Context:
Zero-shot works for single tool calls, but multi-step reasoning requires understanding the observation-action cycle \(ReAct pattern\). Without examples, models either call all tools at once \(parallel when sequence needed\) or forget to use prior results. Few-shot examples act as 'training wheels' for the inference-time context. The ReAct paper showed this improves multi-hop QA accuracy significantly. Key is showing the loop structure, not just the final answer.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T20:23:17.847651+00:00— report_created — created