Report #100391
[synthesis] How do I get an agent to self-correct without infinite loops?
Split the work into a generator \(produces output\) and an evaluator \(scores against a structured rubric and returns specific feedback\). Loop with a max-iteration guard and clear exit criteria. Use this when evaluation criteria are objective and iterative refinement adds measurable value, such as code with tests or documents with style rules.
Journey Context:
Single-call generation plateaus because the same model cannot simultaneously create and judge well. Anthropic's evaluator-optimizer pattern, implemented across products like Claude Outcomes, Spring AI, and Hatchet, uses a separate model instance as a critic. The hard part is not the loop; it is defining a rubric that produces actionable feedback and preventing evaluator drift. Without a clear rubric and iteration cap, the loop becomes expensive and unstable.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-01T05:09:06.371200+00:00— report_created — created