Report #3811
[bug_fix] AWS EKS workload throws 'TokenRefreshException: Failed to refresh token' or 'AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity' after running for exactly 1 hour
When using IAM Roles for Service Accounts (IRSA), the EKS pod mounts a projected service account token at /var/run/secrets/eks.amazonaws.com/serviceaccount/token with an expiry (default 1 hour). The kubelet rotates this file before the token expires, but older AWS SDK versions and custom credential providers read the token once, cache it, and never re-read the file, so they keep presenting the stale token to STS after it has expired. The fix is to upgrade the AWS SDK to a version that supports IRSA token auto-refresh (e.g., AWS SDK for Java 1.11.1000+, Python boto3 1.17.0+, Go SDK v1.37.0+) and to remove any custom caching around the web identity token credentials provider (WebIdentityTokenCredentialsProvider in the AWS SDK for Java 1.x, WebIdentityTokenFileCredentialsProvider in 2.x).
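To confirm that a stale cached token is the cause, it can help to check how long the token on disk is still valid. The following is a minimal sketch using only the Python standard library, not the SDK's own logic. It assumes the token is a standard three-part JWT with an `exp` claim; the helper names (`read_token`, `token_expiry`, `seconds_remaining`) are hypothetical. It reads the file on every call, which is the behaviour a refresh-capable provider needs, and it does not verify the token signature.

```python
import base64
import json
import time
from pathlib import Path

# Path taken from the report. On a pod, the SDK normally learns this path
# from the AWS_WEB_IDENTITY_TOKEN_FILE environment variable.
TOKEN_PATH = Path("/var/run/secrets/eks.amazonaws.com/serviceaccount/token")


def read_token(path=TOKEN_PATH):
    """Read the token from disk on every call instead of caching it."""
    return Path(path).read_text().strip()


def token_expiry(token):
    """Return the 'exp' claim (Unix seconds) of a JWT, without verifying it."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))["exp"]


def seconds_remaining(path=TOKEN_PATH, now=None):
    """Seconds until the token currently on disk expires (negative if expired)."""
    now = time.time() if now is None else now
    return token_expiry(read_token(path)) - now
```

Running `seconds_remaining()` inside the pod shortly before and after the failure window should show whether the file on disk has been rotated while the application keeps using an older token.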
Journey Context:
A platform team migrates a legacy Java microservice from EC2 to EKS using IRSA for S3 access. The application starts successfully and processes files for exactly 60 minutes, then every S3 request starts failing with 'AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity'. The logs show that the token has expired. The team checks the IRSA configuration and finds that the role trust policy is correct and the projected volume is mounted. They then realize the application uses AWS SDK for Java 1.11.86, which predates the automatic token refresh logic for web identity files. They upgrade to AWS SDK for Java 1.12.x, redeploy the pod, and the application runs for days without token expiry issues.
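The version check in this journey is easy to get wrong by eye, because 1.11.86 looks larger than 1.11.1000 when compared as text. The sketch below compares versions numerically. The function names are hypothetical, the minimum is the figure quoted in this report rather than an authoritative AWS value, and it assumes purely numeric dotted versions (so a placeholder such as 1.12.x is not accepted).

```python
def parse_version(text):
    """Turn '1.11.86' (or 'v1.37.0') into a tuple of ints, e.g. (1, 11, 86)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


# Minimum quoted in this report; treat it as the report's claim.
MIN_JAVA_SDK = "1.11.1000"


def meets_minimum(installed, minimum=MIN_JAVA_SDK):
    """True if the installed version is at least the minimum, compared numerically."""
    return parse_version(installed) >= parse_version(minimum)
```

With these definitions, `meets_minimum("1.11.86")` is false and `meets_minimum("1.12.0")` is true, whereas a plain string comparison would rank "1.11.86" above "1.11.1000".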
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T18:16:04.063318+00:00— report_created — created