Report #96672
[bug\_fix] google.auth.exceptions.DefaultCredentialsError: Could not automatically determine credentials... metadata server not reachable / 404 when using Workload Identity on GKE
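Set the node pool's workload metadata mode to \`GKE\_METADATA\`: create a new node pool \(or update the existing one\) with \`--workload-metadata=GKE\_METADATA\`, which corresponds to \`workloadMetadataConfig.mode: GKE\_METADATA\` in the API. The older beta flag \`--workload-metadata-from-node\` takes the value \`GKE\_METADATA\_SERVER\` instead. Root cause: GKE Workload Identity relies on a node-level GKE metadata server that intercepts pod calls to 169.254.169.254. If the node pool runs in \`GCE\_METADATA\` mode \(\`EXPOSED\` or \`SECURE\` under the legacy \`nodeMetadata\` setting\), pods reach the node's Compute Engine metadata server directly. That server has no KSA-to-GSA mapping, so requests return 404s or credentials for the wrong identity.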
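The mode check described above can be sketched in Python. This is a minimal illustration, not an official tool: it assumes the node pool description has been parsed from `gcloud container node-pools describe ... --format=json`, and that the mode sits under `config.workloadMetadataConfig.mode` in that output. The function names are hypothetical.

```python
def workload_metadata_mode(node_pool: dict) -> str:
    """Return the node pool's workload metadata mode.

    Assumption: the mode lives at config.workloadMetadataConfig.mode.
    A missing block is treated as GCE_METADATA, matching the default
    described in this report.
    """
    config = node_pool.get("config") or {}
    wmc = config.get("workloadMetadataConfig") or {}
    return wmc.get("mode", "GCE_METADATA")


def supports_workload_identity(node_pool: dict) -> bool:
    """True only when pods on this pool talk to the GKE metadata server."""
    return workload_metadata_mode(node_pool) == "GKE_METADATA"
```

To use it, load the JSON output of the describe command with `json.load` and pass the resulting dict to `supports_workload_identity`. A `False` result points at the root cause described in this report.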
Journey Context:
You enable Workload Identity on your GKE cluster and follow the docs to annotate your Kubernetes Service Account \(KSA\) with the GCP Service Account \(GSA\) email. You deploy your workload. The pod logs show 'DefaultCredentialsError: Could not automatically determine credentials'. You verify that the KSA annotation is correct. You exec into the pod, run \`curl -H "Metadata-Flavor: Google" http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token\`, and receive a 404 Not Found. You run the same command on the node \(via SSH or the serial console\) and it works, returning a token for the node's default service account. You inspect the node pool configuration with \`gcloud container node-pools describe\` and see \`workloadMetadataConfig.mode: GCE\_METADATA\` \(\`EXPOSED\` under the legacy \`nodeMetadata\` field\), or no workloadMetadataConfig at all, which defaults to \`GCE\_METADATA\`. You realize that Workload Identity only works when the node pool is in \`GKE\_METADATA\` mode, because that mode is what runs the GKE metadata server on its nodes. You create a new node pool with \`--workload-metadata=GKE\_METADATA\`, migrate the workload, and the pod successfully retrieves a token for the mapped GSA from the metadata server.
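The in-pod check from this journey can also be scripted. The sketch below uses only the Python standard library and queries the `email` endpoint rather than `token`, so it reports which identity the metadata server hands out without printing a credential. The helper names `fetch_identity` and `diagnose` and the `expected_gsa` parameter are hypothetical, and the diagnostic messages are illustrative.

```python
import urllib.error
import urllib.request

METADATA_URL = (
    "http://169.254.169.254/computeMetadata/v1/"
    "instance/service-accounts/default/email"
)


def fetch_identity(url=METADATA_URL, timeout=3.0):
    """Return (status, body); status is None if the server is unreachable."""
    # The metadata server expects this header on every request.
    req = urllib.request.Request(url, headers={"Metadata-Flavor": "Google"})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode().strip()
    except urllib.error.HTTPError as err:  # must precede URLError (subclass)
        return err.code, ""
    except (urllib.error.URLError, OSError):
        return None, ""


def diagnose(status, body, expected_gsa):
    """Map a metadata response to a short, human-readable finding."""
    if status is None:
        return "metadata server not reachable"
    if status == 404:
        return "404: no service account served; check the node pool metadata mode"
    if status != 200:
        return f"unexpected HTTP {status}"
    if body == expected_gsa:
        return "ok: pod is running as the mapped GSA"
    return f"wrong identity: got {body!r}"
```

Inside the pod, `print(diagnose(*fetch_identity(), expected_gsa="..."))` with the annotated GSA email gives a one-line verdict. A "wrong identity" result that names the node's service account suggests the pod is still reaching the Compute Engine metadata server directly.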
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:50:53.625504+00:00— report_created — created