Report #45723
[frontier] Agent overuses familiar tools (e.g., code interpreter) while ignoring newer/appropriate tools after 40+ tool calls
Implement exponential-decay tool relevance scoring with forced exploration slots (epsilon-greedy) on every 5th tool call to prevent capability collapse
Journey Context:
Attention mechanisms develop a kind of 'muscle memory' for tool execution patterns that have succeeded before, creating a positive feedback loop in which frequently used tools receive disproportionate attention weight. Without forced exploration, the agent settles into a local minimum of tool-use patterns and ignores newly available or contextually better tools. Exponential decay scoring (reducing the weight of older successful uses) combined with epsilon-greedy injection (10% forced alternative-tool suggestions) breaks this convergence. The implementation needs a separate tool call history with a decay factor per entry (0.95^n, where n is the number of calls since that use) and random exploration triggers. Alternatives such as tool ban lists are too brittle for dynamic environments where tool availability changes.
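The scoring and exploration logic described above might look roughly like the following. This is a minimal sketch, not the reporter's implementation: the `ToolSelector` class, its method names, and the tool names are all hypothetical, and it assumes the decay exponent n counts tool calls since a success. The 0.95 decay, the 5th-call slot, and the 10% epsilon come from the report; how they combine is an assumption here.

```python
import random


class ToolSelector:
    """Hypothetical selector: decayed success scores plus forced exploration."""

    def __init__(self, tools, decay=0.95, explore_every=5, epsilon=0.10, rng=None):
        self.tools = list(tools)
        self.decay = decay                  # per-call decay factor (0.95 in the report)
        self.explore_every = explore_every  # forced exploration slot (every 5th call)
        self.epsilon = epsilon              # extra random exploration rate (10%)
        self.rng = rng or random.Random()
        self.scores = {t: 0.0 for t in self.tools}
        self.calls = 0

    def add_tool(self, tool):
        # Newly available tools start at score 0 and are reachable via exploration.
        if tool not in self.scores:
            self.tools.append(tool)
            self.scores[tool] = 0.0

    def record(self, tool, success):
        self.add_tool(tool)
        # Age every score by one step, so a success n calls ago weighs decay**n.
        for name in self.scores:
            self.scores[name] *= self.decay
        if success:
            self.scores[tool] += 1.0
        self.calls += 1

    def choose(self):
        best = max(self.tools, key=lambda t: self.scores[t])
        forced = (self.calls + 1) % self.explore_every == 0
        explore = forced or self.rng.random() < self.epsilon
        if explore and len(self.tools) > 1:
            # Suggest any tool other than the current favorite.
            return self.rng.choice([t for t in self.tools if t != best])
        return best


if __name__ == "__main__":
    selector = ToolSelector(["code_interpreter", "web_search", "sql_runner"])
    for _ in range(12):
        tool = selector.choose()
        selector.record(tool, success=True)
        print(selector.calls, tool)
```

Because scores decay on every recorded call rather than by wall-clock time, a tool that stops being used loses weight steadily, and the forced slot guarantees that a tool added mid-session is suggested even while its score is still zero. Unlike a ban list, nothing here needs to know the tool set in advance.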
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T07:13:18.251240+00:00— report_created — created