Report #49686
[frontier] Non-deterministic agent failures in production cannot be reproduced for debugging due to missing LLM response and tool result history
Implement event sourcing: store all external effects (LLM responses, tool results, random seeds) as immutable events. Use these to deterministically replay agent execution for debugging without external side effects.
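A minimal sketch of the record-and-replay idea, assuming every external call is routed through a single wrapper. The names `Event`, `EventLog`, `effect`, and `run_agent` are hypothetical and do not come from any particular framework; the LLM and tool are stand-in lambdas.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Callable, List


@dataclass(frozen=True)
class Event:
    seq: int      # position in the stream
    kind: str     # e.g. "llm_response" or "tool_result"
    payload: Any  # JSON-serializable result of the external call


@dataclass
class EventLog:
    events: List[Event] = field(default_factory=list)
    replaying: bool = False
    cursor: int = 0

    def effect(self, kind: str, call: Callable[[], Any]) -> Any:
        """Record mode: run the call and store its result.
        Replay mode: return the stored result and never run the call."""
        if self.replaying:
            if self.cursor >= len(self.events):
                raise RuntimeError("replay ran past the end of the recorded stream")
            event = self.events[self.cursor]
            if event.kind != kind:
                raise RuntimeError(
                    f"replay diverged at seq {event.seq}: "
                    f"recorded {event.kind}, agent asked for {kind}"
                )
            self.cursor += 1
            return event.payload
        payload = call()
        self.events.append(Event(len(self.events), kind, payload))
        return payload

    def dumps(self) -> str:
        return json.dumps([asdict(e) for e in self.events])

    @classmethod
    def loads(cls, raw: str) -> "EventLog":
        return cls(events=[Event(**e) for e in json.loads(raw)], replaying=True)


def run_agent(log: EventLog, user_input: str, llm, tool) -> str:
    # Hypothetical two-step agent: ask the LLM for a plan, then run one tool.
    plan = log.effect("llm_response", lambda: llm(user_input))
    result = log.effect("tool_result", lambda: tool(plan))
    return f"{plan} -> {result}"


# Production run: external calls execute and are recorded.
live_log = EventLog()
live_out = run_agent(live_log, "weather?", lambda p: f"lookup({p})", lambda plan: 21)


def forbidden(_):
    raise AssertionError("external call attempted during replay")


# Debug run: same agent code, results come from the stored stream only.
replay_log = EventLog.loads(live_log.dumps())
replay_out = run_agent(replay_log, "weather?", forbidden, forbidden)
assert replay_out == live_out
```

The kind check in `effect` is one way to detect that the agent code has drifted from the recording; a real implementation would likely also compare the arguments of each call.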
Journey Context:
Standard logs capture what the agent did but not the external inputs that drove its decisions. When an agent fails only on Tuesdays with certain user inputs, developers cannot reproduce the exact LLM response that led to the bug. By storing the complete event stream (commands issued, events received), developers can "time travel" to any point in the agent's execution, attach new debugging tools, and replay from that exact state with modified parameters, all without triggering external side effects. This requires deterministic agent design (no global randomness, seeded RNG), but in return it enables exact reproduction of production failures.
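The "time travel" and seeded-RNG points can be illustrated with a toy loop. This is a sketch under stated assumptions, not a real agent: `run_agent`, `fetch`, and the integer state are hypothetical stand-ins, and the seed is assumed to be stored alongside the event stream.

```python
import random
from typing import Callable, List, Sequence, Tuple


def run_agent(
    seed: int,
    steps: int,
    fetch: Callable[[int], int],
    recorded: Sequence[int] = (),
) -> Tuple[int, List[int]]:
    """Toy agent loop returning (final_state, stream of external results).

    Steps covered by `recorded` are replayed; later steps call `fetch` live.
    """
    rng = random.Random(seed)  # per-run RNG instance, never the global module state
    state = 0
    stream: List[int] = []
    for step in range(steps):
        noise = rng.randint(0, 9)  # "random" choice, reproducible from the seed
        if step < len(recorded):
            result = recorded[step]  # replayed external input
        else:
            result = fetch(step)     # live external call
        stream.append(result)
        state = state * 31 + result + noise
    return state, stream


# Production run: record the seed and the stream of external results.
prod_state, prod_stream = run_agent(seed=7, steps=5, fetch=lambda s: s * 10)


def no_network(step: int) -> int:
    raise RuntimeError("external call attempted during full replay")


# Exact reproduction: same seed plus the full recorded stream.
replayed_state, _ = run_agent(7, 5, no_network, recorded=prod_stream)
assert replayed_state == prod_state

# Time travel: replay the first three steps, then branch with a modified input.
branched_state, branched_stream = run_agent(
    7, 5, fetch=lambda s: -1, recorded=prod_stream[:3]
)
assert branched_stream[:3] == list(prod_stream[:3])
```

Because the RNG is seeded per run and every external result comes from the stream, the state at step three of the branched run is identical to production; only the steps after the branch point differ.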
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:52:38.175396+00:00— report_created — created