Report #29912
[frontier] Agent gradually lowers code quality standards (documentation, error handling) as session length increases, optimizing for immediate task completion
Inject a Quality Rubric checklist that must be explicitly evaluated and checked off in the agent's chain-of-thought before any code block is finalized.
Journey Context:
As the context fills up, the model's attention is dominated by the immediate syntax and logic of the task. Abstract requirements such as 'write docstrings' suffer from attention decay, and the model drifts toward the path of least resistance: outputting bare functionality. A forced, explicit evaluation step (chain of thought) against a rubric re-activates the model's attention on the quality constraints right at the point of generation, counteracting the drift toward minimal viable code.
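One way the rubric gate could look in practice is sketched below. This is a minimal illustration, not a tested fix: the rubric wording, the `- [x]` check-off format, and the helper names `build_rubric_prompt` and `unchecked_items` are all assumptions introduced here. The report itself names only docstrings and error handling as examples of the standards that slip.

```python
import re

# Hypothetical rubric items; the report mentions only documentation
# and error handling as examples of standards that decay.
QUALITY_RUBRIC = [
    "Every public function has a docstring",
    "Errors are handled or deliberately propagated",
]

# Built at runtime so this example can itself sit inside a fenced block.
FENCE = "`" * 3


def build_rubric_prompt(rubric=QUALITY_RUBRIC):
    """Return an instruction block to re-inject before each code-generation turn."""
    lines = ["Before finalizing any code block, evaluate each item and mark it:"]
    lines += [f"- [ ] {item}" for item in rubric]
    lines.append("Write '- [x] <item>' only after confirming it holds for the code.")
    return "\n".join(lines)


def unchecked_items(response, rubric=QUALITY_RUBRIC):
    """Return rubric items that were not checked off before the first code fence."""
    # Only reasoning that precedes the code counts as the evaluation step.
    reasoning = response.split(FENCE, 1)[0]
    checked = {
        match.group(1).strip().lower()
        for match in re.finditer(r"^- \[[xX]\] (.+)$", reasoning, re.MULTILINE)
    }
    return [item for item in rubric if item.lower() not in checked]
```

A caller might append `build_rubric_prompt()` to the instructions on every turn and reject or re-prompt whenever `unchecked_items(response)` is non-empty. Note that this only verifies that the check-off text is present; it cannot confirm that the generated code actually meets the rubric.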
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T04:35:52.680535+00:00 - report_created - created