Report #40005
[bug_fix] AWS AccessDenied when running on EC2/EKS despite a correct IAM role, caused by stale env vars
Unset the environment variables `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, and `AWS_SESSION_TOKEN` in the process or container, or configure the SDK in code to use a specific credential provider (the Instance Metadata Service (IMDS) or the web identity token file) instead of the default chain. Root cause: the AWS SDK's default credential provider chain checks environment variables before role-based sources such as the web identity token file and instance metadata. If stale developer credentials or session tokens from an earlier `aws sts assume-role` call remain in the environment, the SDK uses them instead of the attached EC2 instance profile or the EKS Pod Identity/IRSA role.
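When the variables cannot be removed at the image or pipeline level right away, a process can drop them itself at startup. The following is a minimal sketch of that idea, not an official SDK feature: `scrub_static_credentials` and `STATIC_CREDENTIAL_VARS` are hypothetical names introduced for illustration, and the sketch assumes it runs before the SDK first resolves credentials.

```python
import os

# Static-credential variables that the default chain checks before
# role-based providers (web identity token file, instance metadata).
STATIC_CREDENTIAL_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
)


def scrub_static_credentials(env=None):
    """Remove static AWS credential variables from `env` (default: os.environ).

    Hypothetical helper. Returns the names that were removed so the caller
    can log them. It must run before any SDK session or client is created,
    otherwise already-resolved credentials stay in use.
    """
    env = os.environ if env is None else env
    removed = []
    for name in STATIC_CREDENTIAL_VARS:
        if name in env:
            del env[name]
            removed.append(name)
    return removed
```

Calling `scrub_static_credentials()` as the first statement of the entry point leaves `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` untouched, so the role-based providers can take over. Removing the variables at the source (Dockerfile, CI/CD) remains the cleaner fix.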
Journey Context:
A data engineering team migrates a Spark job from an EMR cluster to an EKS cluster using IRSA (IAM Roles for Service Accounts). The pod has the correct service account annotation, yet the driver logs show 'AccessDenied: User: arn:aws:iam::123:user/old-dev-user is not authorized to perform: s3:GetObject'. The engineer checks the IRSA env vars: `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` are both present. Inspecting `os.environ` in a Python REPL inside the pod, however, reveals that `AWS_ACCESS_KEY_ID` is set to an old dev key. In the Dockerfile they find `ENV AWS_ACCESS_KEY_ID=...` baked in from a previous iteration. The AWS SDK (boto3) checks env vars before the web identity token file in its default chain, so it authenticates as the dev user. Removing the ENV lines from the Dockerfile and making sure the CI/CD pipeline does not inject these vars lets the SDK use the IRSA token file. This works because boto3's default credential provider chain consults environment variables before the web identity token file, with the container and IMDS providers last, so stale env vars shadow the intended pod role.
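The check the engineer performed by hand in the REPL can be sketched as a small diagnostic. The helper names `find_shadowing` and `resolved_provider` are hypothetical. The second function assumes boto3 is installed in the pod and relies on the `method` attribute of the credentials object, which typically reads `env` for environment variables and `assume-role-with-web-identity` for IRSA.

```python
import os

STATIC_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN")
IRSA_VARS = ("AWS_ROLE_ARN", "AWS_WEB_IDENTITY_TOKEN_FILE")


def find_shadowing(env=None):
    """Return static credential variables that would shadow an IRSA role.

    Hypothetical helper: the result is non-empty only when IRSA is fully
    configured (both IRSA variables set) and at least one static
    credential variable is also set to a non-empty value.
    """
    env = os.environ if env is None else env
    irsa_configured = all(env.get(name) for name in IRSA_VARS)
    if not irsa_configured:
        return []
    return [name for name in STATIC_VARS if env.get(name)]


def resolved_provider():
    """Report which provider boto3 actually resolved credentials from.

    Assumes boto3 is importable; imported lazily so the environment check
    above works without it. Returns None when no credentials are found.
    """
    import boto3

    credentials = boto3.Session().get_credentials()
    return None if credentials is None else credentials.method
```

In the scenario above, `find_shadowing()` would return `["AWS_ACCESS_KEY_ID"]` inside the pod, pointing straight at the baked-in Dockerfile `ENV` line.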
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:37:18.430515+00:00— report_created — created