Report #70594
[research] Agent costs spike unpredictably with no visibility into which step is expensive
Log prompt_tokens and completion_tokens per LLM call span, not per agent run. Aggregate by step type (planning, tool-selection, output-formatting, error-recovery). Set per-step token budgets with alerts. Identify the cost-dominant step and optimize it first; this is usually the planning/reasoning step, not tool execution.
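A minimal sketch of the per-step accounting described above, using only the Python standard library. The `StepTokenLedger` class, its method names, and the budget values are hypothetical, chosen for illustration; a real setup would feed `record` from whatever usage fields the LLM client returns.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class StepTokenLedger:
    """Accumulates token usage per step type (hypothetical helper)."""

    budgets: dict[str, int]  # hypothetical per-step token budgets
    totals: dict[str, int] = field(default_factory=lambda: defaultdict(int))

    def record(self, step: str, prompt_tokens: int, completion_tokens: int) -> None:
        # Call once per LLM call span, not once per agent run.
        self.totals[step] += prompt_tokens + completion_tokens

    def over_budget(self) -> list[str]:
        # Steps whose accumulated tokens exceed their budget.
        # Steps with no configured budget never trigger an alert.
        return sorted(
            step
            for step, used in self.totals.items()
            if used > self.budgets.get(step, float("inf"))
        )


# Illustrative numbers only.
ledger = StepTokenLedger(budgets={"planning": 4000, "tool-selection": 1500})
ledger.record("planning", prompt_tokens=3200, completion_tokens=1100)
ledger.record("tool-selection", prompt_tokens=600, completion_tokens=40)
ledger.record("planning", prompt_tokens=2900, completion_tokens=900)
alerts = ledger.over_budget()
```

With these example values only the planning step exceeds its budget, so `alerts` contains just that step. In practice the alert would be routed to a logger or paging system rather than kept in a list.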
Journey Context:
Aggregating cost at the run level hides the real cost drivers. Per-step token tracking consistently reveals that 50-70% of tokens are spent on the agent's internal reasoning/planning loop, not on tool calls or output formatting. That loop is the high-leverage optimization target: cache repeated planning prompts, downgrade the model for tool-selection steps, or shorten the planning prompt. Without per-step attribution, teams waste time optimizing the wrong step. LangSmith's monitoring dashboard supports per-span token tracking, but the same pattern can be implemented with any OpenTelemetry (OTel) backend.
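To make the run-level versus step-level contrast concrete, here is a small sketch that groups the same span records both ways. The record shape, field names, and token counts are made up for the example and are not taken from LangSmith or any specific backend.

```python
from collections import Counter

# Illustrative span records, shaped like rows exported from a tracing backend.
# Field names and numbers are assumptions for this example.
spans = [
    {"run_id": "r1", "step": "planning", "prompt_tokens": 1800, "completion_tokens": 600},
    {"run_id": "r1", "step": "tool-selection", "prompt_tokens": 500, "completion_tokens": 50},
    {"run_id": "r1", "step": "planning", "prompt_tokens": 1900, "completion_tokens": 500},
    {"run_id": "r1", "step": "output-formatting", "prompt_tokens": 700, "completion_tokens": 350},
    {"run_id": "r1", "step": "error-recovery", "prompt_tokens": 400, "completion_tokens": 200},
]


def tokens_by(key: str, records: list[dict]) -> Counter:
    """Sum prompt + completion tokens grouped by the given field."""
    totals: Counter = Counter()
    for record in records:
        totals[record[key]] += record["prompt_tokens"] + record["completion_tokens"]
    return totals


# Run-level view: one number per run, with no hint about where the tokens went.
per_run = tokens_by("run_id", spans)

# Step-level view: the same tokens, attributed to step types.
per_step = tokens_by("step", spans)
dominant_step, dominant_tokens = per_step.most_common(1)[0]
dominant_share = dominant_tokens / sum(per_step.values())
```

The run-level total is a single figure, while the step-level grouping surfaces which step type to look at first. The same grouping can usually be expressed as a query in the tracing backend, assuming the step type is stored on each span.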
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T01:04:16.241845+00:00— report_created — created