Report #62831
[counterintuitive] Model ignores instructions in long outputs — needs a better system prompt
Break long generation tasks into shorter, validated steps with external state management. Reiterate critical constraints at multiple points in the prompt, especially near the end. Use structured output with per-step validation rather than expecting global constraint adherence across thousands of output tokens.
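A minimal sketch of the step-wise approach described above, under stated assumptions: `generate` stands in for whatever model call you use (it is a placeholder, not a real library API), and `CONSTRAINTS`, `TaskState`, `build_prompt`, `validate`, and `run` are hypothetical names chosen for illustration. The constraints are restated at the end of every per-step prompt, task state lives outside the model, and each step is validated before the next one starts.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical constraints for illustration; not taken from the report.
CONSTRAINTS = [
    "Write in British English.",
    "Do not use the word 'very'.",
]


@dataclass
class TaskState:
    """State kept outside the model: what is left and what is finished."""
    pending: List[str]
    done: List[str] = field(default_factory=list)


def build_prompt(step: str, state: TaskState) -> str:
    """Build a short per-step prompt that restates the constraints at the end."""
    rules = "\n".join(f"- {c}" for c in CONSTRAINTS)
    return (
        f"Constraints:\n{rules}\n\n"
        f"Progress: {len(state.done)} section(s) already written.\n"
        f"Task: write the section '{step}'.\n\n"
        f"Reminder, these constraints still apply:\n{rules}"
    )


def validate(text: str) -> List[str]:
    """Toy per-step check; real validators depend on your own constraints."""
    problems = []
    if re.search(r"\bvery\b", text, flags=re.IGNORECASE):
        problems.append("contains the banned word 'very'")
    return problems


def run(state: TaskState, generate: Callable[[str], str],
        max_retries: int = 2) -> TaskState:
    """Generate one step at a time, retrying any step that fails validation."""
    while state.pending:
        step = state.pending[0]
        for _ in range(max_retries + 1):
            output = generate(build_prompt(step, state))
            if not validate(output):
                state.done.append(output)
                state.pending.pop(0)
                break
        else:
            # Every attempt failed: stop rather than accept a bad step.
            raise RuntimeError(f"step {step!r} failed validation")
    return state
```

Because each call produces only one short section, no constraint has to survive thousands of output tokens, and a violation is caught at the step where it occurs rather than at the end of the whole generation.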
Journey Context:
Developers write detailed system prompts with many constraints and expect the model to follow all of them throughout a long generation. But an autoregressive model conditions each new token on whatever is currently in the context window, and as the generated output grows, the original instructions make up a shrinking share of that context. The model has no persistent task state; the context window is all it has. A constraint stated once at the beginning of a 4000-token output can be effectively dropped by token 3000, because it is competing for attention with 3000 tokens of generated content. Rewording the system prompt alone therefore rarely fixes this: the drift comes from how attention behaves over growing sequences, not from prompt quality. The "Lost in the Middle" research is consistent with this picture: it found that models use information in the middle of long input contexts less reliably than information near the start or end. By extension (an inference, not something that study measured directly), the system prompt starts to behave like 'middle' context once enough output has been generated.
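As a rough illustration of the dilution argument above, the sketch below computes what fraction of the context the original instructions occupy as output accumulates. The 500-token prompt size is an assumption, and token share is only a crude proxy: it is not the same thing as attention weight, which depends on the model.

```python
def instruction_share(prompt_tokens: int, generated_tokens: int) -> float:
    """Fraction of the context occupied by the original instructions."""
    return prompt_tokens / (prompt_tokens + generated_tokens)


PROMPT_TOKENS = 500  # assumed prompt size, not a figure from the report

for generated in (0, 1000, 3000):
    share = instruction_share(PROMPT_TOKENS, generated)
    print(f"{generated:>5} generated tokens: instructions are {share:.0%} of context")
```

With these assumed numbers the instructions fall from all of the context to roughly a seventh of it by token 3000.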
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:56:32.296921+00:00— report_created — created