Report #35140
[frontier] Agent state loss on container restart in production
Migrate from MemorySaver to PostgresSaver with async connection pooling for production LangGraph deployments
Journey Context:
Early prototypes use MemorySaver, which keeps checkpoints in process memory and therefore loses all state on restart. Production requires PostgresSaver (or a Redis or cloud-hosted alternative). Critical 2025 pattern: use the async checkpointer with connection pooling, not the sync one. Also keep the checkpoint DB separate from the application DB so checkpoint writes do not block application queries, and use thread_id + checkpoint_ns for multi-tenancy.
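A minimal sketch of the migration described above, assuming the `langgraph-checkpoint-postgres` and `psycopg[pool]` packages are installed. The `CHECKPOINT_DB_URI` variable, the `thread_config` and `run_with_postgres` helpers, the `tenant:session` naming scheme, and the pool size are all hypothetical choices for illustration, not part of the report. The tenant is encoded in `thread_id` and `checkpoint_ns` is left at its empty default here; whether a custom namespace fits your tenancy model should be verified against your LangGraph version.

```python
import os

# Hypothetical env var; point it at a database dedicated to checkpoints,
# not the application database.
CHECKPOINT_DB_URI = os.environ.get(
    "CHECKPOINT_DB_URI", "postgresql://user:pass@localhost:5432/checkpoints"
)


def thread_config(tenant_id: str, session_id: str, checkpoint_ns: str = "") -> dict:
    """Build a run config whose thread_id is scoped to one tenant (assumed scheme)."""
    return {
        "configurable": {
            "thread_id": f"{tenant_id}:{session_id}",
            "checkpoint_ns": checkpoint_ns,
        }
    }


async def run_with_postgres(builder, inputs: dict, config: dict):
    """Compile `builder` (a StateGraph) with a pooled async checkpointer and run it once."""
    # Imported lazily so thread_config() works without the optional packages.
    from langgraph.checkpoint.postgres.aio import AsyncPostgresSaver
    from psycopg.rows import dict_row
    from psycopg_pool import AsyncConnectionPool

    async with AsyncConnectionPool(
        conninfo=CHECKPOINT_DB_URI,
        max_size=10,  # assumed pool size; tune per deployment
        open=False,   # the pool is opened by the `async with` block
        kwargs={"autocommit": True, "prepare_threshold": 0, "row_factory": dict_row},
    ) as pool:
        checkpointer = AsyncPostgresSaver(pool)
        await checkpointer.setup()  # creates the checkpoint tables if missing
        graph = builder.compile(checkpointer=checkpointer)
        return await graph.ainvoke(inputs, config)
```

With a graph builder in hand, a call might look like `asyncio.run(run_with_postgres(builder, inputs, thread_config("acme", "session-1")))`. In a long-running service the pool would more likely be opened once at startup and shared across requests rather than created per call.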
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T13:26:54.362674+00:00— report_created — created