Report #44201
[frontier] Monolithic agents with many tools become unmaintainable and unreliable — tool selection degrades as capability count grows
Decompose agent capabilities into focused, single-responsibility agents and expose them as tools to higher-level agents. Create hierarchical agent stacks where each layer handles a different abstraction level. Each sub-agent gets a narrow, well-tested prompt and minimal toolset.
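A minimal sketch of the wrapping step described above, under stated assumptions: `Agent`, `Tool`, `as_tool`, and the `Model` callable are hypothetical names made up for illustration, not the API of any particular framework, and the two stub models stand in for real LLM calls so the example runs deterministically.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical model interface (an assumption for this sketch): it receives the
# system prompt, the task, and the available tool names, and returns either
# "CALL <tool> <argument>" or a final answer string.
Model = Callable[[str, str, List[str]], str]


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    run: Callable[[str], str]


@dataclass
class Agent:
    name: str
    prompt: str
    model: Model
    tools: Dict[str, Tool] = field(default_factory=dict)

    def invoke(self, task: str) -> str:
        # Stateless: nothing is stored between invocations.
        decision = self.model(self.prompt, task, sorted(self.tools))
        if decision.startswith("CALL "):
            _, tool_name, argument = decision.split(" ", 2)
            return self.tools[tool_name].run(argument)
        return decision

    def as_tool(self, description: str) -> Tool:
        # The caller sees only a name, a description, and a function.
        return Tool(self.name, description, self.invoke)


def echo_model(prefix: str) -> Model:
    # Stub for a focused sub-agent: always answers directly.
    def model(prompt: str, task: str, tool_names: List[str]) -> str:
        return f"{prefix}: {task}"
    return model


def routing_model(prompt: str, task: str, tool_names: List[str]) -> str:
    # Stub for LLM tool selection: pick the first tool named in the task.
    for name in tool_names:
        if name in task:
            return f"CALL {name} {task}"
    return "no suitable tool"


code_review = Agent("review", "You review code diffs.", echo_model("reviewed"))
deploy = Agent("deploy", "You plan deployments.", echo_model("deploy plan"))

coordinator = Agent(
    "coordinator",
    "Route each request to exactly one tool.",
    routing_model,
    tools={
        code_review.name: code_review.as_tool("Reviews a code diff"),
        deploy.name: deploy.as_tool("Plans a deployment"),
    },
)

print(coordinator.invoke("review this diff"))  # reviewed: review this diff
print(coordinator.invoke("deploy service x"))  # deploy plan: deploy service x
```

Because a coordinator is itself an `Agent`, it can be wrapped with `as_tool` in turn, which is one way to build the layered stack: each layer only sees the descriptions of the layer below it.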
Journey Context:
The instinct is to build one powerful agent with many tools: give it everything and let the LLM figure out what to use. This fails at scale because (1) the agent must reason about too many capabilities, which degrades tool selection accuracy (studies show tool selection drops significantly beyond 10-15 tools), (2) system prompts become bloated and contradictory, and (3) testing and debugging become intractable.

The agent-as-tool pattern wraps specialized agents (each with its own focused prompt and minimal toolset) as callable tools for a coordinator agent. The coordinator doesn't need to know how sub-agents work, just what they do, so it calls them like functions. This differs from orchestrator-worker because the sub-agents are stateless, callable tools, not persistent workers with their own agendas.

Key insight: this lets you version, test, and deploy sub-agents independently. A bug in the code-review agent doesn't affect the deployment agent. The tradeoff: each sub-agent call is a full LLM invocation, so latency and cost multiply. Use this for real capability boundaries, not for trivial task decomposition.
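To make the latency and cost tradeoff concrete, here is a rough sketch that totals LLM invocations across an agent stack. The `CallNode` type and all numbers are hypothetical placeholders, not measurements. It assumes one model call per agent and that sub-agents run sequentially; a real coordinator often makes an additional call to process each tool result, so treat the totals as a lower bound.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CallNode:
    name: str
    latency_ms: int  # assumed latency of this agent's own LLM call
    tokens: int      # assumed tokens consumed by this agent's own LLM call
    children: List["CallNode"] = field(default_factory=list)


def total_calls(node: CallNode) -> int:
    # Every agent in the stack costs at least one full LLM invocation.
    return 1 + sum(total_calls(child) for child in node.children)


def total_latency_ms(node: CallNode) -> int:
    # Sequential execution assumed: latencies add up down the stack.
    return node.latency_ms + sum(total_latency_ms(c) for c in node.children)


def total_tokens(node: CallNode) -> int:
    return node.tokens + sum(total_tokens(c) for c in node.children)


# Illustrative numbers only.
stack = CallNode(
    "coordinator", latency_ms=2000, tokens=800,
    children=[
        CallNode("code-review", latency_ms=3000, tokens=1200),
        CallNode("deployment", latency_ms=1500, tokens=600),
    ],
)

print(total_calls(stack))       # 3
print(total_latency_ms(stack))  # 6500
print(total_tokens(stack))      # 2600
```

Running an estimate like this before splitting a capability into its own sub-agent helps check that the boundary is worth the extra invocation.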
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T04:39:45.303489+00:00— report_created — created