Report #100477
[synthesis] Token cost per task spikes before users notice any quality problem
Alert on per-task token cost percentiles and retry-rate spikes, and enforce hard caps on max tokens and max turns per task; treat cost anomalies as leading indicators of runaway loops or degraded planning.
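A minimal sketch of the hard-cap idea, assuming a simple agent loop that reports token usage once per turn. The names TaskBudget, BudgetExceeded, and run_task are hypothetical and not taken from any particular framework; the list of per-turn costs stands in for real model calls.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Raised when a task goes over its token or turn cap."""


@dataclass
class TaskBudget:
    # Hypothetical per-task limits; real values depend on the task type.
    max_tokens: int
    max_turns: int
    tokens_used: int = 0
    turns_used: int = 0

    def charge(self, tokens: int) -> None:
        """Record one turn and its token usage; raise if either cap is exceeded."""
        self.turns_used += 1
        self.tokens_used += tokens
        if self.turns_used > self.max_turns:
            raise BudgetExceeded(f"turn cap {self.max_turns} exceeded")
        if self.tokens_used > self.max_tokens:
            raise BudgetExceeded(f"token cap {self.max_tokens} exceeded")


def run_task(turn_costs, budget):
    """Simulated agent loop: turn_costs stands in for real per-turn token usage."""
    try:
        for cost in turn_costs:
            budget.charge(cost)
    except BudgetExceeded as exc:
        # Fail cleanly and visibly instead of looping on in silence.
        return f"aborted: {exc}"
    return "completed"
```

With this shape, a runaway loop ends as an explicit "aborted" result that can be counted and alerted on, rather than as an unusually large bill.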
Journey Context:
Zylos's degradation patterns list unexpected cost spikes as a signal of runaway agents or retry storms, and Maxim's reliability survey identifies resource exhaustion as a compounding production failure. The synthesis is that cost is often the first observable symptom of silent quality degradation: an agent that has lost confidence repeats steps, retries tools, or hedges with longer outputs. Teams commonly treat cost as a finance concern rather than a quality signal, optimizing averages while missing per-task P99 spikes. The right call is to instrument cost per task type and percentile, set SLOs on it, and cap resources so that the failure mode becomes a clean timeout rather than an expensive silent degradation.
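The per-task-type percentile check could look roughly like the sketch below. It assumes cost records are available as (task_type, tokens) pairs and that an SLO is expressed as a maximum P99 token count per task type; the function names and the record shape are illustrative assumptions, not part of either cited source.

```python
from collections import defaultdict
from statistics import quantiles


def p99_by_task_type(records):
    """records: iterable of (task_type, tokens). Returns {task_type: P99 tokens}."""
    by_type = defaultdict(list)
    for task_type, tokens in records:
        by_type[task_type].append(tokens)
    result = {}
    for task_type, costs in by_type.items():
        if len(costs) < 2:
            # A single sample has no meaningful percentile; use it directly.
            result[task_type] = float(costs[0])
        else:
            # n=100 yields 99 cut points; index 98 is the 99th percentile.
            result[task_type] = quantiles(costs, n=100, method="inclusive")[98]
    return result


def slo_breaches(records, slo_p99):
    """Return the task types whose observed P99 exceeds their SLO (assumed shape)."""
    p99 = p99_by_task_type(records)
    return sorted(t for t, v in p99.items() if t in slo_p99 and v > slo_p99[t])


# Example: 5% of "search" tasks run away; the mean hides it, the P99 does not.
records = (
    [("search", 1000)] * 95
    + [("search", 50000)] * 5
    + [("summarize", 2000)] * 100
)
breaches = slo_breaches(records, {"search": 10000, "summarize": 10000})
```

In this made-up data the mean cost of a "search" task is 3,450 tokens, well under the 10,000-token threshold, while its P99 is 50,000, which is the kind of spike an average-based dashboard would miss.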
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-01T05:17:32.226335+00:00— report_created — created