AKS VM service degradation issue after AKS upgrade

I ran into an issue with AKS VM degradation. The interesting thing is that the node pool indicates the node is up and running, but when I reviewed the VMSS instances in the infrastructure resource group, I noticed that one of the VMs had been showing service degradation for a month or maybe more.

Resolving this is quite easy: just redeploy the VM instance to a new VM. It should work, but if this is production we need to be careful. Make sure someone gets notified, and drain the node so its pods are evicted before you redeploy. Finally test, test and test to make sure everything still works.
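
The order of operations above can be sketched as a small Python helper that only builds the commands, so they can be reviewed before anyone runs them. This is a rough sketch, not the exact procedure I ran: the function name `build_redeploy_plan` and every value in the usage example (node name, resource group, scale set name, instance ID) are made-up placeholders, and the drain flags are a common choice that may need adjusting for your workloads.

```python
import shlex


def build_redeploy_plan(node_name, resource_group, vmss_name, instance_id):
    """Return the ordered CLI commands for redeploying one VMSS instance.

    Nothing is executed here; the caller decides how and when to run them.
    """
    return [
        # Stop new pods from being scheduled onto the degraded node.
        ["kubectl", "cordon", node_name],
        # Evict the pods already running there (DaemonSet pods are skipped).
        ["kubectl", "drain", node_name,
         "--ignore-daemonsets", "--delete-emptydir-data"],
        # Redeploy the VMSS instance in the infrastructure resource group.
        ["az", "vmss", "redeploy",
         "--resource-group", resource_group,
         "--name", vmss_name,
         "--instance-ids", str(instance_id)],
        # Allow scheduling again once the instance is back.
        ["kubectl", "uncordon", node_name],
        # First of the "test, test and test" checks: is the node Ready?
        ["kubectl", "get", "node", node_name],
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    plan = build_redeploy_plan(
        node_name="aks-nodepool1-00000000-vmss000002",
        resource_group="MC_example-rg_example-aks_eastus",
        vmss_name="aks-nodepool1-00000000-vmss",
        instance_id="2",
    )
    for cmd in plan:
        print(shlex.join(cmd))
```

Printing the plan instead of executing it is deliberate: on a production cluster it gives whoever needs to be notified a chance to read the exact commands first.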


