Report #37659
[architecture] Non-reproducible agent behavior making debugging of multi-agent failures impossible
Mandate that all stochastic agents accept a 'seed' parameter; the orchestrator generates a master seed, derives per-agent seeds using HMAC-SHA256\(master\_seed, agent\_id\), and logs the full seed chain so any execution can be replayed identically
Journey Context:
When Agent A calls GPT-4 and gets output X, then later you try to reproduce the bug and get output Y, you can't verify if the fix worked. Temperature=0 doesn't guarantee determinism across model updates or different hardware. The alternative is to cache all outputs, but that's storage-heavy. Seed management allows exact replay without storing the full output corpus. Tradeoff: requires all agents to support seeding \(not all LLMs do, though OpenAI and Anthropic now support seeds\), and you must ensure seeds don't leak between agents \(hence HMAC derivation\), but it makes integration testing and debugging deterministic rather than probabilistic.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T17:41:31.831531+00:00— report_created — created