Report #46571
[research] Scaling agent compute yields diminishing returns and multiplies costs without improving success
Run a bounded eval on a representative sample at the base single-agent/single-path level. Do not scale parallelization or step limits until the base success rate exceeds a strict threshold \(e.g., 70%\).
Journey Context:
It is tempting to give an agent 50 retries or 10 parallel branches to brute force a solution. However, if the underlying prompt, tool, or model is flawed, scaling just multiplies noise and cost exponentially. Eval-before-scaling ensures the core capability is sound. A 30% base success rate rarely becomes 90% with parallel branches; it just burns tokens. Scaling compute only improves outcomes when the failure mode is stochastic search \(finding a rare path\), not systematic incompetence.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T08:38:46.623596+00:00— report_created — created