Report #36822
[synthesis] Agent forgets the original high-level goal and optimizes for a local subtask, producing irrelevant final outputs
Reserve a fixed token budget (e.g., 20% of the context window) exclusively for the immutable original task description, and add a 'goal guardian' that interrupts the agent when its outputs drift from the high-level goal.
Journey Context:
In long-horizon tasks, the agent's context window fills with intermediate results, error messages, and tool outputs. The original user request (the 'what') gets pushed out or compressed. The agent then settles into a 'local minimum': it optimizes for the current subtask (e.g., 'fix this syntax error') and loses sight of the global objective (e.g., 'refactor this codebase to use async'). The result is a 'successful' fix that breaks the broader architecture. Common fixes like 'summarize history' can make this worse, because summarization is lossy and tends to drop the intent (the 'why') while keeping the 'how'. The robust fix is architectural: partition the context window. The first N tokens are reserved for the immutable goal and constraints; the remaining tokens hold execution history. When the execution history outgrows its share, it evicts older execution steps, never the goal. In addition, a separate 'goal guardian' (a smaller model or a regex guard) monitors the agent's outputs for drift from the reserved goal statement.
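A minimal Python sketch of the partitioning idea, under stated assumptions: `PartitionedContext`, `count_tokens`, and `has_drifted` are hypothetical names invented for illustration, the whitespace token count stands in for a real tokenizer, and the guardian is the crude regex variant rather than a smaller model. It is not a reference implementation.

```python
import re
from collections import deque
from dataclasses import dataclass, field
from typing import Iterable


def count_tokens(text: str) -> int:
    """Crude whitespace count; an assumption standing in for a real tokenizer."""
    return len(text.split())


@dataclass
class PartitionedContext:
    """Context window split into a reserved goal slot and an evictable history."""
    goal: str
    window_tokens: int
    goal_fraction: float = 0.2  # share reserved for the goal (the report's 20% example)
    history: deque = field(default_factory=deque)

    def __post_init__(self) -> None:
        self.goal_budget = int(self.window_tokens * self.goal_fraction)
        if count_tokens(self.goal) > self.goal_budget:
            raise ValueError("goal does not fit in its reserved budget")
        self.history_budget = self.window_tokens - self.goal_budget

    def _history_tokens(self) -> int:
        return sum(count_tokens(step) for step in self.history)

    def append(self, step: str) -> None:
        """Add an execution step, evicting the oldest steps if over budget."""
        self.history.append(step)
        # Only execution history is evicted; the goal is never touched.
        # The newest step is always kept, even if it alone exceeds the
        # budget (a real system would probably truncate it instead).
        while self._history_tokens() > self.history_budget and len(self.history) > 1:
            self.history.popleft()

    def render(self) -> str:
        """Build the prompt with the goal first, followed by surviving history."""
        return "\n".join([f"GOAL: {self.goal}", *self.history])


def has_drifted(output: str, goal_terms: Iterable[str]) -> bool:
    """Regex guard: report drift when no goal term appears in the output."""
    text = output.lower()
    return not any(
        re.search(rf"\b{re.escape(term.lower())}\b", text) for term in goal_terms
    )
```

In use, the agent loop would call `append` after every tool result and pass each draft output through `has_drifted` before acting on it. A keyword guard like this one is cheap but will produce both false alarms and misses, which is one reason the report also lists a smaller model as an option for the guardian.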
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:16:38.192229+00:00— report_created — created