Report #100533
[bug\_fix] FailedScheduling due to node taints
Run \`kubectl describe pod\` to see the scheduler message naming the taint, then \`kubectl describe node <node-name>\` to view the node's active taints. Either remove the taint with \`kubectl taint nodes <node-name> <key>=<value>:<effect>-\` (the trailing \`-\` is what removes it) or add a matching \`tolerations\` entry to the Pod/Deployment spec. If the node is tainted because of specialised hardware, prefer adding a toleration so only appropriate workloads schedule there.
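As a rough sketch of what a matching \`tolerations\` entry looks like, the fragment below belongs under the Pod spec (for a Deployment, under \`spec.template.spec\`). The key, value, and effect shown are hypothetical placeholders; they have to match the taint reported by \`kubectl describe node\` exactly.

```yaml
# Hypothetical values: replace key, value, and effect with those of the
# taint actually present on the node.
tolerations:
  - key: "example-key"
    operator: "Equal"        # match both key and value
    value: "example-value"
    effect: "NoSchedule"     # must equal the taint's effect
```

Note that a toleration only permits scheduling onto the tainted node; it does not by itself force the Pod to land there.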
Journey Context:
A Deployment could not place any Pods; \`kubectl get pods\` showed all of them \`Pending\` and \`kubectl describe pod\` reported \`0/3 nodes are available: 3 node\(s\) had taint \{dedicated=gpu:NoSchedule\}, that the pod didn't tolerate\`. The cluster had been rebuilt by the platform team and the GPU nodes were now tainted to keep general workloads off them. The Pod template for the machine-learning training job had no matching toleration. The team added a toleration for \`dedicated=gpu:NoSchedule\` to the Pod template, and the Pods scheduled onto the GPU nodes. Because a toleration only allows scheduling onto tainted nodes and does not require it, they also added a node affinity rule to pin the workload to nodes labelled \`dedicated=gpu\`.
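The fix described above might look roughly like the following manifest. This is a minimal sketch, not the team's actual configuration: the Deployment name, labels, container name, and image are hypothetical, and it assumes the GPU nodes carry a \`dedicated=gpu\` label in addition to the taint.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: training-job                # hypothetical name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: training-job             # hypothetical label
  template:
    metadata:
      labels:
        app: training-job
    spec:
      # Allow the Pods onto nodes tainted dedicated=gpu:NoSchedule.
      tolerations:
        - key: "dedicated"
          operator: "Equal"
          value: "gpu"
          effect: "NoSchedule"
      # Require nodes labelled dedicated=gpu (assumes that label exists;
      # a taint does not create a label).
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: "dedicated"
                    operator: In
                    values:
                      - "gpu"
      containers:
        - name: trainer             # hypothetical container
          image: registry.example.com/trainer:latest   # hypothetical image
```

The toleration and the affinity rule do different jobs: the first lets the Pods past the taint, the second keeps them from being placed on any node without the \`dedicated=gpu\` label.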
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-02T04:40:09.898331+00:00— report_created — created