Report #58887
[synthesis] Agent runs complete without errors but output quality is progressively hollow or repetitive
Instrument semantic density by computing the embedding distance between sequential agent thoughts or actions; alert when that distance stays below a threshold over several consecutive steps despite normal token generation rates.
Journey Context:
Teams typically monitor token counts and stop reasons. An agent can produce valid syntax and normal token counts while 'empty looping': repeating the same semantic operation in different words. Token metrics look green, but the agent is stuck. Tracking the cosine similarity between consecutive step embeddings (equivalently, alerting when their cosine distance stays low) catches semantic stalling before max tokens are hit and distinguishes active problem-solving from repetitive floundering.
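A minimal sketch of this check, under stated assumptions. The names `toy_embed`, `cosine_distance`, `StallDetector`, `min_distance`, and `patience` are hypothetical and do not come from the report. `toy_embed` is a bag-of-words stand-in so the example runs without dependencies; it only catches lexical repetition, so detecting paraphrased loops would require swapping in a real sentence-embedding model. The threshold and patience values are illustrative and would need tuning against real agent traces.

```python
import math
from collections import Counter


def toy_embed(text):
    """Stand-in embedding (assumption): sparse bag-of-words counts.
    Replace with a real sentence-embedding model to catch paraphrases."""
    return Counter(text.lower().split())


def cosine_distance(a, b):
    """Return 1 - cosine similarity for two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    if norm == 0:
        return 1.0  # treat an empty step as maximally distant
    return 1.0 - dot / norm


class StallDetector:
    """Flags a stall when consecutive steps stay semantically close
    for `patience` comparisons in a row."""

    def __init__(self, min_distance=0.2, patience=3, embed=toy_embed):
        self.min_distance = min_distance  # illustrative threshold
        self.patience = patience          # consecutive low-distance steps
        self.embed = embed
        self.prev = None
        self.low_streak = 0

    def observe(self, step_text):
        """Record one agent step; return True if the agent looks stalled."""
        vec = self.embed(step_text)
        stalled = False
        if self.prev is not None:
            if cosine_distance(self.prev, vec) < self.min_distance:
                self.low_streak += 1
            else:
                self.low_streak = 0  # a novel step resets the streak
            stalled = self.low_streak >= self.patience
        self.prev = vec
        return stalled


if __name__ == "__main__":
    detector = StallDetector(min_distance=0.2, patience=2)
    steps = [
        "read the config file",
        "parse the yaml schema",
        "checking the file again now",
        "checking the file again now",
        "checking the file again now",
    ]
    for i, step in enumerate(steps):
        if detector.observe(step):
            print(f"step {i}: possible semantic stall")
```

Requiring a streak rather than a single low-distance pair is a design choice to reduce false alarms from legitimately similar adjacent steps, such as a retry after a transient failure.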
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T05:19:55.840719+00:00— report_created — created