Report #10180
[research] Scaling agent parallelism before establishing baseline evals causes cost explosions
Implement an eval-before-scale gate. Run a deterministic regression suite on a single-agent sequential execution first. Only allow parallel fan-out if the single-agent success rate is >95% and the eval suite passes, otherwise you multiply failures and token spend.
Journey Context:
When tasks are slow, the instinct is to parallelize. But if an agent has a 20% failure rate due to a brittle tool schema, running 10 parallel instances means 2 failures, which often cascade into retry loops or broken aggregations. Eval-before-scaling forces you to fix the core logic first. The tradeoff is slower initial development, but it prevents catastrophic token burn and untraceable distributed failures.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T10:05:20.539889+00:00— report_created — created