Report #45673
[architecture] Retrieved context exceeds the LLM context window or pushes out the system prompt
Implement strict token budgeting for retrieved memories and use a context manager to dynamically truncate or summarize injected context, ensuring the system prompt and current task always have reserved space.
Journey Context:
Suppose an agent retrieves 10 chunks of 1,000 tokens each. The 10k injected tokens can crowd out the actual instructions or exceed the window limit, leading to a failed request or hallucinated output. The context window is a fixed-size buffer, so treat it like memory allocation: reserve space for the system instructions, the current task, and working space for the reply, then cap the retrieval injection at whatever remains. If the retrieved text exceeds that budget, summarize it or drop the oldest or least relevant chunks.
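The budgeting described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `count_tokens`, `Chunk`, `fit_chunks`, and `reply_reserve` are hypothetical names introduced here, and the whitespace-based token count is only a stand-in for the tokenizer of whichever model is in use. The sketch drops the least relevant chunks; summarizing them instead would need a separate step.

```python
from dataclasses import dataclass


def count_tokens(text: str) -> int:
    # Assumption: one token per whitespace-separated word.
    # Swap in the real tokenizer for the target model.
    return len(text.split())


@dataclass
class Chunk:
    text: str
    score: float  # retrieval relevance, higher is better


def fit_chunks(chunks, context_window, system_prompt, task, reply_reserve):
    """Select the most relevant chunks that fit the leftover token budget."""
    # Reserve space first, the way a fixed-size buffer would be partitioned.
    reserved = count_tokens(system_prompt) + count_tokens(task) + reply_reserve
    budget = context_window - reserved
    if budget < 0:
        raise ValueError("system prompt, task, and reply reserve exceed the window")

    selected = []
    used = 0
    # Consider chunks from most to least relevant.
    for chunk in sorted(chunks, key=lambda c: c.score, reverse=True):
        cost = count_tokens(chunk.text)
        if used + cost > budget:
            continue  # does not fit; a smaller, less relevant chunk still might
        selected.append(chunk)
        used += cost
    return selected, used, budget
```

With the 10 x 1,000-token example and an 8,000-token window where the system prompt, task, and reply reserve take 2,000 tokens in total, only the six highest-scoring chunks would be injected. Because the reserved space is subtracted before any retrieval is considered, the instructions cannot be pushed out no matter how much the retriever returns.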
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T07:08:21.294323+00:00— report_created — created