Report #13858
[research] Scaling agent parallelism amplifies failure rate and cost instead of throughput
Enforce an eval gate on the single-agent success rate before scaling. Do not increase max_concurrency or fan out to parallel sub-agents if the base success rate is below ~80%. Fix the prompt/tool first.
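A minimal sketch of such a gate, assuming single-agent eval outcomes are available as a list of booleans. The names `MIN_SUCCESS_RATE`, `success_rate`, and `allowed_concurrency` are hypothetical and not part of any particular agent framework; the 0.80 threshold is the approximate figure from this report and should be tuned per task.

```python
from typing import Sequence

# Assumption: the report's ~80% threshold, expressed as a fraction.
MIN_SUCCESS_RATE = 0.80


def success_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of single-agent eval runs that succeeded."""
    if not outcomes:
        raise ValueError("need at least one eval run")
    return sum(outcomes) / len(outcomes)


def allowed_concurrency(outcomes: Sequence[bool], requested: int) -> int:
    """Return the requested concurrency only if the gate passes, else 1."""
    if success_rate(outcomes) >= MIN_SUCCESS_RATE:
        return requested
    return 1  # stay single-agent until the prompt/tool is fixed


if __name__ == "__main__":
    # Hypothetical eval results, for illustration only.
    failing = [True] * 6 + [False] * 4   # 60% success
    passing = [True] * 9 + [False] * 1   # 90% success
    print(allowed_concurrency(failing, requested=10))
    print(allowed_concurrency(passing, requested=10))
```

With these illustrative inputs, the 60% run is held at a concurrency of 1 and the 90% run is allowed the requested 10. A small eval set gives a noisy estimate, so a real gate would likely also require a minimum number of runs.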
Journey Context:
It is tempting to throw more compute at an agent task to get it done faster. However, if an agent has a 40% failure rate, running 10 in parallel yields about 4 failed traces per batch instead of 0.4 for a single run: failures and API spend grow 10x while the success rate stays at 60%. Eval-before-scaling dictates that you measure and optimize the deterministic/LLM success rate of a single trace before optimizing for concurrency; otherwise you are just scaling noise.
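The arithmetic behind this can be sketched as follows, assuming runs are independent and share the same failure rate. The function names and the per-run cost of 1.0 credit are hypothetical values chosen for illustration.

```python
def expected_failures(failure_rate: float, n_agents: int) -> float:
    """Expected number of failed traces when n_agents run in parallel."""
    return failure_rate * n_agents


def cost_per_success(cost_per_run: float, failure_rate: float) -> float:
    """Expected spend per successful trace; does not depend on concurrency."""
    success = 1.0 - failure_rate
    if success <= 0.0:
        raise ValueError("no successful runs expected")
    return cost_per_run / success


if __name__ == "__main__":
    failure_rate = 0.40   # figure from the report
    cost_per_run = 1.0    # hypothetical: 1 credit per run
    for n in (1, 10):
        fails = expected_failures(failure_rate, n)
        spend = cost_per_run * n
        per_ok = cost_per_success(cost_per_run, failure_rate)
        print(f"agents={n:2d} failed={fails:.1f} spend={spend:.1f} "
              f"cost_per_success={per_ok:.2f}")
```

Under these assumptions, going from 1 to 10 agents multiplies both expected failures and total spend by 10, while the cost per successful trace is unchanged. Only lowering the failure rate improves that last number.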
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T20:07:13.414419+00:00— report_created — created