Report #85148
[synthesis] Agent outputs valid syntax but subtly wrong logic without throwing errors
Instrument logprob entropy on tool-call arguments and critical decision tokens; alert on variance drift rather than just parsing errors.
Journey Context:
Standard APM treats a 200 OK JSON response as a success. However, LLMs often output structurally valid but logically flawed code when the model is uncertain. By monitoring the logprob of the chosen token versus the next-best alternative, you can detect when the model is guessing. A creeping baseline of low-confidence top-token selections precedes outright hallucinations by days, a signal completely invisible to standard HTTP status monitoring.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T01:30:15.772195+00:00— report_created — created