
kubernetes, cgroups v2 and failing health probes

Running Kubernetes on a systemd-based OS can lead to pods restarting endlessly due to health probe failures, caused by the systemd cgroup driver. Previously, this was fixed by modifying boot options, but a simpler approach is to switch the kubelet to use cgroupfs instead.

Daniel Nachtrub

A while ago, I wrote a post about systemd-based systems, cgroups v2 and Kubernetes (https://blog.nuvotex.de/ubuntu-22-04-or-21-10-kubernetes-cgroups-v2/). The issue described there affected Kubernetes clusters created with kubeadm on systemd-based hosts.

Just as a recap:

Running Kubernetes on a systemd-based operating system can lead to pods that start but whose health probes never succeed, so the pods are restarted over and over again.

The cause is related to the cgroup driver (systemd) and to the systemd cgroup hierarchy. The old blog post presented a fix that sets systemd.unified_cgroup_hierarchy=0 via boot options. Time for another approach.
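Before changing anything, it can help to confirm that the host is actually running cgroups v2. The following is a minimal sketch, assuming the usual mount point /sys/fs/cgroup; the function name cgroup_version is made up for illustration and is not part of any tool. It relies on the filesystem type reported by stat: cgroup2fs indicates the unified (v2) hierarchy, tmpfs the legacy (v1) layout.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for illustration only).
# Maps the filesystem type of /sys/fs/cgroup to a cgroup version label.
# An explicit filesystem type can be passed as $1; otherwise it is read via stat.
cgroup_version() {
    fstype="${1:-$(stat -fc %T /sys/fs/cgroup/)}"
    case "$fstype" in
        cgroup2fs) echo "v2" ;;                 # unified hierarchy
        tmpfs)     echo "v1" ;;                 # legacy hierarchy
        *)         echo "unknown ($fstype)" ;;  # anything else: do not guess
    esac
}
```

Calling cgroup_version without arguments on the node prints the detected version.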

Another solution - switch cgroup driver

If you don't want to fiddle with GRUB or the systemd cgroup behavior, you can simply switch the kubelet to the cgroupfs driver. As of today, there is no real difference between the cgroup drivers. If your cluster was created with kubeadm, kubeadm may have selected systemd for you (it does so if it detects systemd on the host).

Adjust the cgroup driver by editing the kubelet config at /var/lib/kubelet/config.yaml, or use this command:

sed -i 's/cgroupDriver: systemd/cgroupDriver: cgroupfs/' /var/lib/kubelet/config.yaml

switch cgroup driver to cgroupfs
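A slightly more defensive variant of the command above could look like the following sketch. The function name switch_cgroup_driver is hypothetical; it assumes cgroupDriver is a top-level key in the kubelet config, as in the sed command above, and it keeps a copy of the original file with a .bak suffix.

```shell
#!/bin/sh
# Hypothetical helper (not part of kubeadm or the kubelet).
# Switches cgroupDriver from systemd to cgroupfs in a kubelet config file.
# $1: path to the config, defaults to the path used in this post.
switch_cgroup_driver() {
    cfg="${1:-/var/lib/kubelet/config.yaml}"

    if [ ! -f "$cfg" ]; then
        echo "config file not found: $cfg" >&2
        return 1
    fi

    # Leave the file alone if it does not request the systemd driver.
    if ! grep -q '^cgroupDriver: systemd' "$cfg"; then
        echo "cgroupDriver is not set to systemd in $cfg, nothing changed"
        return 0
    fi

    # -i.bak edits in place and keeps the original as "$cfg.bak".
    sed -i.bak 's/^cgroupDriver: systemd/cgroupDriver: cgroupfs/' "$cfg"

    # Print the resulting setting for a quick visual check.
    grep '^cgroupDriver:' "$cfg"
}
```

The kubelet reads this file at startup, so the change only takes effect after a restart, for example with systemctl restart kubelet after running switch_cgroup_driver on the node. This also assumes the container runtime is using cgroupfs, since the kubelet and the runtime need to agree on the cgroup driver.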

Personally, I prefer switching the cgroup driver: it's the more elegant solution and requires no deviation from the system defaults (such as those of containerd or the OS cgroup settings themselves).

Cloud, Container, Kubernetes, Linux, Monitoring

Daniel Nachtrub

Kind of likes computers. Linux foundation certified: LFCS / CKA / CKAD / CKS. Microsoft certified: Cybersecurity Architect Expert & Azure Solutions Architect Expert.