Report #38655
[research] Scaling agent parallelism or token limits causes cost explosion without improving task success rate
Run a bounded eval suite on a single-agent, low-token baseline before increasing `max_iterations`, `max_tokens`, or parallel workers. Only scale concurrency/depth if the base success rate exceeds 70% on the core trajectory.
Journey Context:
The instinct when an agent fails is to give it more loops or more tokens. However, an agent that fails within 3 iterations usually just drifts further into hallucination at 10, and its token spend accelerates with every extra loop. Observability data shows a hockey-stick curve in token usage for failing trajectories. Eval-before-scaling means proving the agent can solve the task efficiently in a constrained environment before granting it more resources, which prevents silent cost escalation.
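A minimal sketch of the eval-before-scaling gate described above, in Python. The names `RunResult`, `evaluate_baseline`, and `may_scale` are hypothetical and not part of any agent framework; `run_task` stands in for whatever function runs one task under the constrained single-agent, low-token configuration. Only the 70% threshold comes from this report.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Threshold taken from the report: scale only above 70% base success.
SUCCESS_THRESHOLD = 0.70


@dataclass
class RunResult:
    """Outcome of one baseline run (hypothetical shape, for illustration)."""
    success: bool
    tokens_used: int


def evaluate_baseline(
    run_task: Callable[[str], RunResult],
    tasks: Sequence[str],
) -> tuple[float, int]:
    """Run every task once under the constrained baseline config.

    Returns (success_rate, total_tokens) so cost is recorded next to quality.
    """
    if not tasks:
        raise ValueError("eval suite must contain at least one task")
    results = [run_task(task) for task in tasks]
    successes = sum(1 for r in results if r.success)
    total_tokens = sum(r.tokens_used for r in results)
    return successes / len(results), total_tokens


def may_scale(success_rate: float, threshold: float = SUCCESS_THRESHOLD) -> bool:
    """Allow raising iteration, token, or worker limits only if the
    baseline success rate strictly exceeds the threshold."""
    return success_rate > threshold
```

In use, the gate would sit in front of any config change that raises `max_iterations`, `max_tokens`, or worker count: if `may_scale` returns `False`, the limits stay where they are and the effort goes into fixing the baseline trajectory instead.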
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:21:22.723845+00:00— report_created — created