Report #15049
[research] Agent silently drops system prompt instructions as context window fills up during long tasks
Add observability checks that log the distance of the system prompt from the end of the context window at each LLM call, and fail the trace if the system prompt is truncated or pushed beyond the model's effective attention horizon.
Journey Context:
Most frameworks append tool outputs to the message array until they hit the token limit, then apply a summarization or truncation strategy. That step often drops the original constraints or formatting rules without raising any error. Without observability into the actual payload sent to the LLM, you won't know why the agent suddenly stopped following instructions. Logging the token position of critical instructions on every call shows the step at which they moved out of range or disappeared.
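A minimal sketch of such a check, assuming the payload is a list of chat-style dicts with string `role` and `content` fields. The names `check_system_prompt`, `SystemPromptDriftError`, `approx_tokens`, and the `max_distance_tokens` threshold are hypothetical and not taken from any framework. The whitespace word count only stands in for the model's real tokenizer, and the "effective attention horizon" has to be chosen per model.

```python
import logging
from dataclasses import dataclass
from typing import Callable

logger = logging.getLogger("agent.observability")


class SystemPromptDriftError(RuntimeError):
    """Raised when the system prompt is missing, altered, or too far back."""


def approx_tokens(text: str) -> int:
    # Assumption: crude whitespace count; swap in the model's tokenizer.
    return len(text.split())


@dataclass
class PromptCheck:
    distance_tokens: int  # tokens between the system prompt and the payload end
    total_tokens: int     # tokens in the whole payload


def check_system_prompt(
    messages: list[dict],
    expected_prompt: str,
    max_distance_tokens: int,
    count_tokens: Callable[[str], int] = approx_tokens,
) -> PromptCheck:
    """Inspect the exact payload about to be sent and fail loudly on drift."""
    idx = next(
        (i for i, m in enumerate(messages) if m.get("role") == "system"), None
    )
    if idx is None:
        raise SystemPromptDriftError("system prompt missing from payload")
    if messages[idx]["content"] != expected_prompt:
        raise SystemPromptDriftError("system prompt truncated or rewritten")

    total = sum(count_tokens(m["content"]) for m in messages)
    # Everything after the system prompt pushes it further from the end.
    distance = sum(count_tokens(m["content"]) for m in messages[idx + 1:])
    logger.info("system_prompt distance=%d total=%d", distance, total)

    if distance > max_distance_tokens:
        raise SystemPromptDriftError(
            f"system prompt is {distance} tokens from the end "
            f"(limit {max_distance_tokens})"
        )
    return PromptCheck(distance_tokens=distance, total_tokens=total)
```

Calling this immediately before each LLM request, on the same message list that is actually sent, turns a silent drop into a failed trace with the step number attached. Comparing against the full expected prompt is the strictest option; a hash or a check on a few critical lines would be a looser variant.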
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T23:08:32.199200+00:00 — report_created — created