Report #59501
[gotcha] Multi-turn conversations silently degrade as context window fills with no UI signal
Track cumulative token usage from each API response's `usage` field. Surface a visual indicator at 75-80% of context capacity. Implement automatic summarization or message pruning before the API silently drops earlier turns.
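A minimal sketch of the tracking and threshold check, under stated assumptions. `ContextTracker`, `context_limit`, and `warn_ratio` are hypothetical names, and the `usage` payload is assumed to be a plain dict carrying either `input_tokens`/`output_tokens` or `prompt_tokens`/`completion_tokens`; check your provider's response schema. It also assumes every request resends the full history, so the latest response's input count already covers all earlier turns and the sketch overwrites the running figure rather than summing across responses.

```python
from dataclasses import dataclass


@dataclass
class ContextTracker:
    """Tracks how full the context window is (hypothetical helper)."""

    context_limit: int        # model's context window in tokens; look this up per model
    warn_ratio: float = 0.75  # lower bound of the 75-80% warning band
    used_tokens: int = 0

    def record(self, usage: dict) -> None:
        # Field names differ between providers; both spellings are assumptions.
        prompt = usage.get("input_tokens", usage.get("prompt_tokens", 0))
        completion = usage.get("output_tokens", usage.get("completion_tokens", 0))
        # The prompt count already includes every earlier turn that was resent,
        # so overwrite instead of accumulating.
        self.used_tokens = prompt + completion

    @property
    def fill_ratio(self) -> float:
        return self.used_tokens / self.context_limit

    def status(self) -> str:
        # The UI can map these states to a badge, colour, or progress bar.
        if self.fill_ratio >= 1.0:
            return "full"
        if self.fill_ratio >= self.warn_ratio:
            return "warning"
        return "ok"
```

Call `record(...)` after every response and render `status()` next to the input box; a `"warning"` state is the cue to summarize or prune before the next request.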
Journey Context:
LLM APIs don't warn when you approach the context limit. Once it is exceeded, early messages are truncated or the request fails with an error. Users experience truncation as the AI "forgetting" things it knew 5 turns ago: they repeat themselves, the AI seems inconsistent, and frustration compounds. The trap is that developers assume the API handles context management. It doesn't; you must build it yourself. The tradeoff is between summarization (lossy, but extends the conversation) and hard limits (preserves fidelity, but ends conversations sooner). Doing nothing is not an option, so pick one deliberately.
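One way to sketch the pruning side of that tradeoff, assuming messages are dicts with `role` and `content` keys. `prune_messages` and `estimate_tokens` are hypothetical helpers, and the four-characters-per-token estimate is a rough heuristic rather than a real tokenizer; substitute your provider's token counter where one is available.

```python
from typing import Callable


def estimate_tokens(message: dict) -> int:
    # Rough heuristic (assumption): about 4 characters per token.
    return max(1, len(message["content"]) // 4)


def prune_messages(
    messages: list[dict],
    budget: int,
    count: Callable[[dict], int] = estimate_tokens,
) -> list[dict]:
    """Keep system messages plus the most recent turns that fit in `budget`."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]

    total = sum(count(m) for m in system)
    kept: list[dict] = []
    # Walk backwards from the newest turn so the oldest turns are dropped first.
    for message in reversed(rest):
        cost = count(message)
        if total + cost > budget:
            break
        kept.append(message)
        total += cost

    kept.reverse()  # restore chronological order
    return system + kept
```

Pruning preserves the surviving turns verbatim but loses the dropped ones entirely. A summarization variant would replace the dropped turns with a single summary message instead of discarding them, at the cost of an extra model call and some loss of detail.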
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T06:21:41.530849+00:00— report_created — created