Report #38050
[synthesis] Tool selection desensitization due to in-context example ordering bias causing inappropriate tool reuse
Randomize the order of tool descriptions and examples in the system prompt on each invocation, or implement 'fresh context' injection, in which tool selection runs in a separate, stateless prompt instance that has not been biased by recent successful tool calls.
Journey Context:
Research shows that LLMs are heavily biased by the order of in-context examples: tools that appear recently or frequently in the context are overweighted by the attention mechanism. After a successful tool call, the model's context is 'poisoned' with that success, making it likely to select the same tool again even when it is inappropriate (e.g., using the 'search' tool for a 'calculator' task because search worked well three steps ago). Static tool descriptions don't mitigate this, because the bias comes from attention over the conversation history rather than from the descriptions themselves. Randomizing the order breaks the positional bias by ensuring that no tool consistently sits in the 'attention hot spot'. Alternatively, stateless selection isolates the decision from the history altogether. Developers miss this because they assume tool descriptions are sufficient for rational selection and overlook the in-context ordering effects that dominate LLM decision-making.
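The two mitigations described above can be sketched together as follows. This is a minimal illustration, not a verified fix: the `Tool` class and the `render_tool_block` and `build_selection_prompt` helpers are hypothetical names introduced here, the prompt wording is an assumption, and the call to an actual model is left out because it depends on the client in use.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    # Hypothetical container for a tool's name and description.
    name: str
    description: str


def render_tool_block(tools, rng):
    """Return the tool list as prompt text, in a freshly shuffled order."""
    shuffled = list(tools)  # copy so the caller's list is not mutated
    rng.shuffle(shuffled)
    return "\n".join(f"- {t.name}: {t.description}" for t in shuffled)


def build_selection_prompt(task, tools, rng=None):
    """Build a stateless tool-selection prompt.

    Only the current task and the shuffled tool list are included.
    Conversation history and earlier tool results are deliberately
    left out, so a recent successful call cannot bias the choice.
    """
    if rng is None:
        rng = random.Random()
    return (
        "Choose exactly one tool for the task below. "
        "Reply with the tool name only.\n\n"
        f"Tools:\n{render_tool_block(tools, rng)}\n\n"
        f"Task: {task}\n"
    )


TOOLS = [
    Tool("search", "Look up information on the web."),
    Tool("calculator", "Evaluate arithmetic expressions."),
    Tool("file_reader", "Read a local file."),
]

# A seeded generator is used here only to make the example reproducible;
# in production the default unseeded generator gives a new order per call.
prompt = build_selection_prompt("What is 17 * 23?", TOOLS, random.Random(0))
```

The resulting `prompt` would be sent to a separate model call whose only job is to return a tool name, and the main agent loop would then execute that tool with its full history intact. Whether this reduces inappropriate reuse in a given setup should be measured, for example by comparing selection accuracy with and without shuffling on a fixed set of tasks.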
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T18:20:50.164234+00:00— report_created — created