Report #56538
[synthesis] Large tool lists cause primacy/recency selection bias in GPT-4o but keyword collision hallucinations in Claude
For GPT-4o, dynamically retrieve only the top 3-5 relevant tools (RAG for tools) rather than passing all 20. For Claude, ensure tool descriptions are mutually exclusive and use precise, distinct terminology to avoid keyword collision.
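A minimal sketch of the "RAG for tools" idea, assuming tools are plain dicts with a `name` and a `description`. The `TOOLS` list, the `select_tools` helper, and the bag-of-words cosine score are all hypothetical stand-ins: a real setup would more likely score with an embedding model, and nothing here reflects a specific vendor API.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical tool registry; names and descriptions are illustrative only.
TOOLS = [
    {"name": "get_weather", "description": "Return the weather forecast for a city"},
    {"name": "send_email", "description": "Send an email message to a recipient"},
    {"name": "search_flights", "description": "Search airline flights between two airports"},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_tools(query, tools, k=5):
    """Return up to k tools whose descriptions best match the query.

    Assumption: lexical overlap stands in for embedding similarity.
    Tools with no overlap at all are dropped rather than padded in.
    """
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(t["description"]))), t) for t in tools]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for score, tool in scored[:k] if score > 0]


if __name__ == "__main__":
    shortlist = select_tools("What is the weather forecast in Paris?", TOOLS, k=2)
    print([t["name"] for t in shortlist])
```

Only the shortlist returned by `select_tools` would then be passed to the model as its tool list, which keeps the list short enough that position in the list matters less.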
Journey Context:
Passing a very large tool list is a common anti-pattern. OpenAI models show the same lost-in-the-middle effect with tool lists that they show with long contexts: tools in the middle of the list tend to be ignored in favor of those near the start or end. Claude has a large context window, but when descriptions are vague it behaves like a keyword matcher during tool selection, calling a tool whose description shares a keyword with the prompt even when that tool is inappropriate. At scale, RAG-based tool selection is mandatory for GPT-4o; for Claude, a precise ontology (mutually exclusive descriptions with distinct terminology) is mandatory.
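One way to audit descriptions for the keyword collision described above is a pairwise overlap check run before the tool list is deployed. This is a rough lexical heuristic under stated assumptions, not a model of how Claude actually selects tools: `find_collisions`, the `STOPWORDS` set, and the 0.3 threshold are all hypothetical choices for illustration.

```python
import re
from itertools import combinations

# Small illustrative stopword set; a real audit would likely use a fuller list.
STOPWORDS = {"a", "an", "the", "to", "for", "of", "in", "on", "and", "or", "from", "with"}


def keywords(description):
    """Return the set of lowercase content words in a description."""
    words = re.findall(r"[a-z0-9]+", description.lower())
    return {w for w in words if w not in STOPWORDS}


def find_collisions(tools, threshold=0.3):
    """Flag pairs of tools whose descriptions share too much vocabulary.

    `tools` maps tool name -> description. Overlap is Jaccard similarity
    over content words; the threshold is an arbitrary starting point.
    """
    collisions = []
    for (name_a, desc_a), (name_b, desc_b) in combinations(tools.items(), 2):
        ka, kb = keywords(desc_a), keywords(desc_b)
        union = ka | kb
        overlap = len(ka & kb) / len(union) if union else 0.0
        if overlap >= threshold:
            collisions.append((name_a, name_b, sorted(ka & kb)))
    return collisions


if __name__ == "__main__":
    example = {
        "search_orders": "Search customer orders by date",
        "search_invoices": "Search customer invoices by date",
        "get_weather": "Return weather forecast for a city",
    }
    for name_a, name_b, shared in find_collisions(example):
        print(f"{name_a} / {name_b} share: {shared}")
```

Flagged pairs are candidates for rewording so that each description uses terms the others do not; whether that removes the misfires still has to be checked against the model itself.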
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T01:23:30.189653+00:00— report_created — created