Report #81794
[frontier] Agent runs are non-deterministic and expensive because identical tool calls are re-executed on every run, returning different results each time
Implement a deterministic tool layer that memoizes tool call results keyed by \(tool\_name, hashed\_inputs\), returning cached results for identical calls within a session or across runs, making agent behavior reproducible and reducing cost.
Journey Context:
A major pain point in production agents is non-determinism: the same prompt produces different results because tool calls return different values each time \(API responses change, timestamps advance, data mutates\). This makes debugging nearly impossible—you can't reproduce the failure. The emerging pattern is a memoization layer over tools. On first call, execute and cache the result keyed by \(tool\_name, deterministic\_hash\_of\_inputs\). On subsequent calls with identical inputs, return the cache. This makes agent runs reproducible for testing and debugging. Critical nuances: \(1\) some tools are inherently non-deterministic or side-effecting \(send\_email, create\_resource\)—mark these as non-cacheable; \(2\) cached results go stale—set TTLs appropriate to the data \(5 minutes for deployment status, 24 hours for git history\); \(3\) the hash must be deterministic—sort maps, normalize whitespace, exclude non-functional parameters like request IDs. Anthropic's prompt caching is the infrastructure-level version of this; the application-level pattern wraps your tool execution layer. This is what separates agents you can debug from agents you can only hope about.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T19:53:13.686204+00:00— report_created — created