Report #6579
[gotcha] AWS IAM instance profile intermittent authentication failures on EC2 instance launch
After attaching an IAM instance profile to an EC2 instance, retry the first AWS API calls with exponential backoff (up to 30 seconds) instead of assuming credentials are ready. Alternatively, query the EC2 Instance Metadata Service (IMDSv2) with a session token and verify that 'InstanceProfileArn' is present in the metadata before proceeding.
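A minimal sketch of the retry option, assuming the 30 seconds is a total sleep budget (the report does not say whether it is a total or a per-attempt cap). The names `backoff_delays` and `call_with_backoff` and the `base` and `cap` values are hypothetical, chosen for illustration only.

```python
import time


def backoff_delays(base=0.5, cap=8.0, budget=30.0):
    """Yield sleep intervals that double each time, capped per step,
    and stop once the summed sleep time reaches `budget` seconds."""
    total = 0.0
    delay = base
    while total < budget:
        step = min(delay, cap, budget - total)
        yield step
        total += step
        delay *= 2


def call_with_backoff(fn, is_retryable, sleep=time.sleep, **schedule):
    """Call fn() until it succeeds, a non-retryable error occurs,
    or the backoff budget is spent (the last error is then re-raised)."""
    delays = backoff_delays(**schedule)
    while True:
        try:
            return fn()
        except Exception as exc:
            if not is_retryable(exc):
                raise
            delay = next(delays, None)
            if delay is None:
                raise  # budget exhausted: surface the last failure
            sleep(delay)
```

With boto3, `is_retryable` could for example test for `botocore.exceptions.NoCredentialsError`; which errors count as retryable is a judgment call for the caller. Adding random jitter to each delay is a common refinement that is left out here to keep the schedule easy to read.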
Journey Context:
When an EC2 instance launches with an IAM instance profile, the IAM role credentials are not immediately available via the Instance Metadata Service (IMDS). AWS documentation notes that propagation can take 'several seconds,' but in practice it regularly exceeds 10 to 20 seconds, especially in newer regions or during high-throughput auto-scaling events. User-data scripts that immediately call 'aws s3 cp' or 'aws sts assume-role' fail with 'Unable to locate credentials' or 'InvalidAccessKeyId'. The common anti-pattern is adding 'sleep 10', which is both insufficient (failures remain intermittent) and slow (every launch pays the full delay). The correct implementation obtains an IMDSv2 session token, then polls 'http://169.254.169.254/latest/meta-data/iam/info' until it returns HTTP 200 with a valid 'InstanceProfileArn', and only then proceeds. This confirms that IMDS is actually serving the instance profile rather than guessing at timing. For Auto Scaling groups, this is critical: if the instance signals success to the ASG before verifying IAM readiness, it enters service without functional credentials, causing cascading failures in application startup.
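The polling approach described above might be sketched as follows, using only the Python standard library. The 'iam/info' path and the 'InstanceProfileArn' field come from this report; the token path and the two `X-aws-ec2-metadata-token` headers are the standard IMDSv2 ones. The function names, the 30-second deadline, and the 1-second interval are assumptions for illustration.

```python
import json
import time
import urllib.request

IMDS = "http://169.254.169.254"


def imds_token(ttl_seconds=60, timeout=2):
    """Request a short-lived IMDSv2 session token."""
    req = urllib.request.Request(
        IMDS + "/latest/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read().decode()


def fetch_iam_info(timeout=2):
    """Return the parsed iam/info document, or None if it is not served yet."""
    try:
        req = urllib.request.Request(
            IMDS + "/latest/meta-data/iam/info",
            headers={"X-aws-ec2-metadata-token": imds_token(timeout=timeout)},
        )
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return json.loads(resp.read().decode())
    except (OSError, ValueError):
        # HTTP errors (e.g. 404), timeouts, and malformed JSON all mean "not ready".
        return None


def wait_for_instance_profile(fetch=fetch_iam_info, deadline_seconds=30.0,
                              interval=1.0, sleep=time.sleep):
    """Poll until iam/info reports an InstanceProfileArn and return it;
    raise TimeoutError if the deadline passes first."""
    give_up_at = time.monotonic() + deadline_seconds
    while True:
        info = fetch()
        arn = info.get("InstanceProfileArn") if isinstance(info, dict) else None
        if arn:
            return arn
        if time.monotonic() >= give_up_at:
            raise TimeoutError("instance profile not visible in IMDS")
        sleep(interval)
```

A user-data script would call `wait_for_instance_profile()` before its first AWS call and before signaling success to the Auto Scaling group, treating `TimeoutError` as a failed launch. The `fetch` and `sleep` parameters exist only so the loop can be exercised off-instance.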
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T00:23:22.525184+00:00— report_created — created