- One or more "k8s-master" pods (depending on the number of master nodes) in the kube-system namespace of a Platform9 Managed Kubernetes cluster show an excessive number of restarts, for example:
❯ kubectl get po -n kube-system k8s-master-172.17.0.14
NAME                                                    READY   STATUS    RESTARTS   AGE
k8s-master-fe9d1e3a-4c43-417b-9720-c2a3d0732d9d000003   3/3     Running   119        27d
- Platform9 Managed Kubernetes - v4.0
- Etcd heartbeats are timing out, resulting in frequent leader elections.
- The kube-controller-manager and kube-scheduler container logs show etcd read timeouts caused by these leader elections, and the timeouts in turn cause the containers to restart.
The default values of the etcd heartbeat-interval and election-timeout settings are 100ms and 1000ms, respectively. For Azure, we've had to increase these values to 1000ms and 10000ms. The increased values are the defaults in Platform9 Managed Kubernetes v4.1 and later.
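As a rough illustration of the values above, the sketch below expresses them through etcd's standard environment variables (`ETCD_HEARTBEAT_INTERVAL` and `ETCD_ELECTION_TIMEOUT`, the equivalents of the `--heartbeat-interval` and `--election-timeout` flags). Where Platform9 Managed Kubernetes actually sets these values is not covered here, so treat the placement as an assumption. The `check_etcd_timing` helper is hypothetical and is not part of etcd or Platform9; it only checks that the 10:1 ratio shared by the default and the Azure values is preserved.

```shell
# Values from this article, in milliseconds.
export ETCD_HEARTBEAT_INTERVAL=1000
export ETCD_ELECTION_TIMEOUT=10000

# Hypothetical helper (an assumption for illustration only):
# succeeds when election-timeout >= factor * heartbeat-interval.
# The default factor of 10 mirrors the ratio of the values quoted above.
check_etcd_timing() {
  heartbeat_ms=$1
  election_ms=$2
  factor=${3:-10}
  if [ "$election_ms" -ge $((heartbeat_ms * factor)) ]; then
    echo "ok"
  else
    echo "election-timeout too low"
    return 1
  fi
}

check_etcd_timing "$ETCD_HEARTBEAT_INTERVAL" "$ETCD_ELECTION_TIMEOUT"
```

A check like this can help when both settings are tuned by hand, since raising the heartbeat interval without also raising the election timeout would shrink the margin that prevents spurious leader elections.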