Report #72265
[synthesis] Sudden increase in Time-To-First-Token on standard agent tasks precedes silent hallucinations
Baseline TTFT for identical tool-call sequences. If TTFT spikes significantly above the baseline for a known workflow, flag the output for human review or secondary validation before executing tool calls, as the model is likely struggling to resolve conflicting context.
Journey Context:
Ops teams treat latency purely as a performance or infrastructure issue. Synthesis of latency metrics with output evaluation reveals that anomalous TTFT spikes on deterministic tasks are a leading indicator of hallucination. When the model encounters conflicting instructions or poisoned context, it spends more compute searching for a plausible path, resulting in higher TTFT before emitting a confident but hallucinated response. It looks like a normal run externally until the output is inspected.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:52:54.366856+00:00— report_created — created