Upgrading an AKS cluster – how it works

2019, Mar 06    

I was talking to someone about the way AKS handles cluster upgrades, which is pretty easy to understand.  You have a cluster sized to N nodes, and you request an upgrade.  The first thing we do is add another node to the cluster, already running the new version.  Then, one node at a time, we take a node out of the cluster's available pool, update its version of Kubernetes, and put it back into the pool.  This happens for every node except the last one, which we simply remove.  That leaves you with N nodes again, one of which is the extra node that was added at the beginning of the process.
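The sequence described above can be sketched as a small simulation. This is only an illustrative model of the steps in this post, not the AKS implementation: the function name `simulate_upgrade` and the event labels are made up for the example, and the node names are taken from the log further down.

```python
def simulate_upgrade(node_names, old_version, new_version, surge_name):
    """Model the upgrade: add one extra node, upgrade all but the last
    original node in place, then remove the last original node."""
    if not node_names:
        raise ValueError("the cluster needs at least one node")

    pool = {name: old_version for name in node_names}
    events = []

    # Step 1: add an extra node that already runs the new version.
    pool[surge_name] = new_version
    events.append(("add", surge_name))

    # Step 2: upgrade every original node except the last, one at a time.
    for name in node_names[:-1]:
        events.append(("take_out_of_pool", name))
        pool[name] = new_version
        events.append(("upgrade", name))
        events.append(("return_to_pool", name))

    # Step 3: the last original node is removed rather than upgraded.
    last = node_names[-1]
    events.append(("take_out_of_pool", last))
    del pool[last]
    events.append(("remove", last))

    return pool, events


pool, events = simulate_upgrade(
    ["aks-agentpool-24883706-0", "aks-agentpool-24883706-1"],
    old_version="v1.12.4",
    new_version="v1.12.6",
    surge_name="aks-agentpool-24883706-2",
)
for event in events:
    print(event)
print(pool)
```

The pool ends with the same number of nodes it started with, all on the new version, and the highest-numbered original node is gone.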

I’ve included the “kubectl get nodes -o wide -w” log below from a recent upgrade from 1.12.4 to 1.12.6 so you can see what I mean, and the timings that are involved.

It took around 3 minutes for the new node aks-agentpool-24883706-2 (already on the target Kubernetes version, 1.12.6) to be added to my cluster.  It then took roughly a further 4 minutes for the node aks-agentpool-24883706-0 to be upgraded.

If you also look at the private IP address that gets allocated to aks-agentpool-24883706-2 (10.240.0.66), you can see why the formula for sizing your virtual network is important: the extra node added during the upgrade needs address space of its own.  See: https://gordon.byers.me/azure/networking-basics-in-the-azure-kubernetes-service/
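As a rough illustration of that sizing point, the sketch below checks the spacing between the node IPs in the log and estimates how many addresses a cluster needs once the extra upgrade node is counted. It assumes the commonly documented Azure CNI rule of one address per node plus one per pod slot, and a default of 30 pods per node; both are assumptions here rather than something this post states, and `required_ips` is a hypothetical helper name.

```python
import ipaddress


def required_ips(node_count, max_pods_per_node=30, surge_nodes=1):
    """Estimate the addresses needed, counting the extra upgrade node.

    Assumes each node uses one IP for itself plus one per pod slot.
    """
    total_nodes = node_count + surge_nodes
    return total_nodes + total_nodes * max_pods_per_node


# Node IPs as they appear in the log below.
node_ips = ["10.240.0.4", "10.240.0.35", "10.240.0.66"]
addresses = sorted(int(ipaddress.ip_address(ip)) for ip in node_ips)
gaps = [b - a for a, b in zip(addresses, addresses[1:])]

print(gaps)             # [31, 31]
print(required_ips(2))  # 93
```

The gap of 31 between consecutive node addresses is consistent with one node address plus 30 pod addresses, so a 2-node cluster would want room for 3 such blocks while it is being upgraded.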

NAME                       STATUS   ROLES   AGE   VERSION   INTERNAL-IP   EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
aks-agentpool-24883706-1   Ready   agent   33d   v1.12.4   10.240.0.35   <none>   Ubuntu 16.04.5 LTS   4.15.0-1035-azure   docker://3.0.1
aks-agentpool-24883706-0   Ready   agent   33d   v1.12.4   10.240.0.4   <none>   Ubuntu 16.04.5 LTS   4.15.0-1035-azure   docker://3.0.1
aks-agentpool-24883706-2   NotReady   agent   0s    v1.12.6   10.240.0.66   <none>   Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
aks-agentpool-24883706-2   NotReady   agent   1s    v1.12.6   10.240.0.66   <none>   Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
aks-agentpool-24883706-1   Ready   agent   33d   v1.12.4   10.240.0.35   <none>   Ubuntu 16.04.5 LTS   4.15.0-1035-azure   docker://3.0.1
aks-agentpool-24883706-0   Ready   agent   33d   v1.12.4   10.240.0.4   <none>   Ubuntu 16.04.5 LTS   4.15.0-1035-azure   docker://3.0.1
aks-agentpool-24883706-2   Ready   agent   11s   v1.12.6   10.240.0.66   <none>   Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
aks-agentpool-24883706-1   Ready   agent   33d   v1.12.4   10.240.0.35   <none>   Ubuntu 16.04.5 LTS   4.15.0-1035-azure   docker://3.0.1
aks-agentpool-24883706-0   Ready,SchedulingDisabled   agent   33d   v1.12.4   10.240.0.4   <none>   Ubuntu 16.04.5 LTS   4.15.0-1035-azure   docker://3.0.1
aks-agentpool-24883706-2   Ready   agent   21s   v1.12.6   10.240.0.66   <none>   Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
aks-agentpool-24883706-2   Ready   agent   51s   v1.12.6   10.240.0.66   <none>   Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
aks-agentpool-24883706-1   Ready   agent   33d   v1.12.4   10.240.0.35   <none>   Ubuntu 16.04.5 LTS   4.15.0-1035-azure   docker://3.0.1
aks-agentpool-24883706-0   NotReady,SchedulingDisabled   agent   33d   v1.12.4   10.240.0.4   <none>   Ubuntu 16.04.5 LTS   4.15.0-1035-azure   docker://3.0.1
aks-agentpool-24883706-2   Ready   agent   112s   v1.12.6   10.240.0.66   <none>   Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
aks-agentpool-24883706-1   Ready   agent   33d   v1.12.4   10.240.0.35   <none>   Ubuntu 16.04.5 LTS   4.15.0-1035-azure   docker://3.0.1
aks-agentpool-24883706-0   NotReady,SchedulingDisabled   agent   33d   v1.12.4   10.240.0.4   <none>   Ubuntu 16.04.5 LTS   4.15.0-1035-azure   docker://3.0.1
aks-agentpool-24883706-2   Ready   agent   4m44s   v1.12.6   10.240.0.66   <none>   Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
aks-agentpool-24883706-1   Ready   agent   33d   v1.12.4   10.240.0.35   <none>   Ubuntu 16.04.5 LTS   4.15.0-1035-azure   docker://3.0.1
aks-agentpool-24883706-0   Ready   agent   1s    v1.12.6   10.240.0.4   <none>   Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
aks-agentpool-24883706-1   Ready,SchedulingDisabled   agent   33d   v1.12.4   10.240.0.35   <none>   Ubuntu 16.04.5 LTS   4.15.0-1035-azure   docker://3.0.1
aks-agentpool-24883706-2   Ready   agent   4m54s   v1.12.6   10.240.0.66   <none>   Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
aks-agentpool-24883706-0   Ready   agent   11s   v1.12.6   10.240.0.4   <none>   Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
aks-agentpool-24883706-0   Ready   agent   41s   v1.12.6   10.240.0.4   <none>   Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
aks-agentpool-24883706-2   Ready   agent   5m34s   v1.12.6   10.240.0.66   <none>   Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
aks-agentpool-24883706-1   NotReady,SchedulingDisabled   agent   33d   v1.12.4   10.240.0.35   <none>   Ubuntu 16.04.5 LTS   4.15.0-1035-azure   docker://3.0.1
aks-agentpool-24883706-0   Ready   agent   51s   v1.12.6   10.240.0.4   <none>   Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
aks-agentpool-24883706-2   Ready   agent   5m44s   v1.12.6   10.240.0.66   <none>   Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
aks-agentpool-24883706-0   Ready   agent   10m   v1.12.6   10.240.0.4   <none>   Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
aks-agentpool-24883706-2   Ready   agent   15m   v1.12.6   10.240.0.66   <none>   Ubuntu 16.04.5 LTS   4.15.0-1037-azure   docker://3.0.4
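
The watch output above repeats a row every time a node changes, which makes the state changes hard to pick out. A minimal sketch of one way to collapse it into per-node transitions follows. The sample is an excerpt of the log above trimmed to its first five columns, `summarise_transitions` is a hypothetical helper, and the parsing assumes columns are separated by two or more spaces.

```python
import re
from collections import defaultdict

# Excerpt of the watch output above, trimmed to the first five columns.
SAMPLE_LOG = """\
aks-agentpool-24883706-0   Ready   agent   33d   v1.12.4
aks-agentpool-24883706-2   NotReady   agent   0s    v1.12.6
aks-agentpool-24883706-2   NotReady   agent   1s    v1.12.6
aks-agentpool-24883706-2   Ready   agent   11s   v1.12.6
aks-agentpool-24883706-0   Ready,SchedulingDisabled   agent   33d   v1.12.4
aks-agentpool-24883706-0   NotReady,SchedulingDisabled   agent   33d   v1.12.4
aks-agentpool-24883706-0   Ready   agent   1s    v1.12.6
"""


def summarise_transitions(log_text):
    """Return {node name: [(status, version), ...]} without consecutive repeats."""
    history = defaultdict(list)
    for line in log_text.splitlines():
        # Columns are separated by runs of two or more spaces.
        fields = re.split(r"\s{2,}", line.strip())
        if len(fields) < 5 or fields[0] == "NAME":
            continue  # skip blank lines and the header row
        name, status, version = fields[0], fields[1], fields[4]
        state = (status, version)
        if not history[name] or history[name][-1] != state:
            history[name].append(state)
    return dict(history)


for node, states in summarise_transitions(SAMPLE_LOG).items():
    print(node, " -> ".join(f"{status} ({version})" for status, version in states))
```

For aks-agentpool-24883706-0 this shows the node being taken out of scheduling on v1.12.4, going NotReady, and coming back Ready on v1.12.6.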