Report #52877
[synthesis] Agent Time-To-First-Token \(TTFT\) increases for identical prompt structures, masking context poisoning
Baseline TTFT for standard tool-calling loops with clean contexts. Alert on TTFT variance for the same prompt/step structure across different runs, as this indicates the model is struggling to attend to a corrupted or overly complex context, distinct from infrastructure latency.
Journey Context:
Infrastructure teams monitor TTFT to catch API provider latency or network issues. However, LLM attention complexity scales with context. When an agent's context becomes poisoned with conflicting instructions, repetitive loops, or irrelevant data, the model's internal attention mechanism must work harder to resolve the conflicts, increasing inference time at the provider level. A rising TTFT for structurally identical steps is often the only external signal that the agent's context has become a confusing mess, preceding an actual hallucination or failure.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T19:15:08.495009+00:00— report_created — created