Report #50007
[gotcha] GCP Cloud SQL IAM authentication token expiry causes connection pool failures after 1 hour
When using IAM database authentication, set the connection pool max-lifetime (e.g., HikariCP `maxLifetime`, Go `pgx` `MaxConnLifetime`, Python SQLAlchemy `pool_recycle`) to strictly less than 3600 seconds, for example 3000 seconds (50 minutes). Do not rely on driver-level password callbacks to refresh existing idle connections.
Journey Context:
When using Cloud SQL IAM database authentication (Postgres or MySQL), the 'password' is an OAuth 2.0 access token that is valid for 1 hour. Standard database connection pools (HikariCP, pgxpool, SQLAlchemy) keep persistent TCP connections open and reuse them for performance.

The critical trap: once the IAM token expires, the pooled connection is still alive at the TCP level, but, as observed in this report, the next SQL query on it fails with 'password authentication failed' or 'invalid authentication token'. The report attributes this to the Cloud SQL backend re-validating the token after the connection was established. Nothing rotates the token in the background on a direct connection; the application has to supply a fresh one.

Most engineers assume the connection pool will 'refresh' the connection automatically, but standard JDBC/DBAPI drivers do not re-invoke the password provider for existing idle connections; they use the password callback only when creating *new* physical connections. The pool must therefore be configured to close and recreate connections (max-lifetime) before the 1-hour token expiry. Running the Cloud SQL Auth Proxy as a sidecar mitigates this because it handles token refresh, but when connecting directly over private IP with IAM auth, this pool configuration is mandatory.
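
The interaction between max-lifetime and the password callback can be illustrated with a minimal, standard-library-only sketch. It is a toy model, not any real pool's implementation: `FakeConnection`, `RecyclingPool`, and the injected `clock` are hypothetical names made up for illustration, and the 3600 s / 3000 s values are the ones suggested in this report. The point it shows is that the token provider runs only when a new physical connection is opened, so retiring connections before the token TTL is what forces a fresh token.

```python
import time
from collections import deque
from typing import Callable, Deque

TOKEN_TTL_SECONDS = 3600      # IAM access token lifetime described in this report
MAX_LIFETIME_SECONDS = 3000   # pool max-lifetime, strictly below the token TTL


class FakeConnection:
    """Hypothetical stand-in for a physical DB connection."""

    def __init__(self, token: str, created_at: float) -> None:
        self.token = token            # token used at login time
        self.created_at = created_at  # when the connection was opened
        self.closed = False

    def close(self) -> None:
        self.closed = True


class RecyclingPool:
    """Toy pool: retires connections once they reach max_lifetime."""

    def __init__(
        self,
        token_provider: Callable[[], str],
        max_lifetime: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if max_lifetime >= TOKEN_TTL_SECONDS:
            raise ValueError("max_lifetime must be strictly less than the token TTL")
        self._token_provider = token_provider
        self._max_lifetime = max_lifetime
        self._clock = clock
        self._idle: Deque[FakeConnection] = deque()

    def acquire(self) -> FakeConnection:
        now = self._clock()
        while self._idle:
            conn = self._idle.popleft()
            if now - conn.created_at < self._max_lifetime:
                return conn   # young enough: reuse, no token lookup happens
            conn.close()      # too old: retire it instead of reusing it
        # Only this path calls the token provider, mirroring how drivers
        # invoke the password callback for new physical connections only.
        return FakeConnection(self._token_provider(), now)

    def release(self, conn: FakeConnection) -> None:
        self._idle.append(conn)
```

In a real application the same effect comes from the pool's own setting (HikariCP `maxLifetime`, pgxpool `MaxConnLifetime`, SQLAlchemy `pool_recycle`) rather than hand-written pooling code; the sketch only makes the timing rule visible.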
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T14:25:24.700708+00:00— report_created — created