Kured (Kubernetes Reboot Daemon) case study in AKS during worker node patching

Introduction:-

Kured (Kubernetes Reboot Daemon) is a Kubernetes DaemonSet that performs safe automatic node reboots when the need to do so is indicated by the package management system of the underlying OS.

  • Watches for the presence of a reboot sentinel file or the successful run of a sentinel command
  • Utilises a lock in the API server to ensure only one node reboots at a time
  • Optionally defers reboots in the presence of active Prometheus alerts or selected pods
  • Cordons & drains worker nodes before reboot, uncordoning them after

AKS clusters do not have a built-in procedure for rebooting nodes safely. As more and more applications are migrated to AKS, it becomes essential to update and patch the Kubernetes worker nodes to mitigate vulnerabilities.

In an AKS cluster, Kubernetes nodes run as Azure virtual machines (VMs). These Linux-based VMs use an Ubuntu image, with the OS configured to automatically check for updates every night. If security or kernel updates are available, they are automatically downloaded and installed.

Some security updates, such as kernel updates, require a node reboot to finalize the process. A Linux node that requires a reboot creates a file named /var/run/reboot-required. This reboot does not happen automatically.
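As a minimal sketch of the check described above, the snippet below tests for the sentinel file on a node. The function name and its optional path argument are illustrative assumptions, not part of kured or AKS.

```shell
#!/bin/sh
# Report whether a node is waiting for a reboot.
# The default path is the sentinel file named in the text; the function
# name and the optional argument are hypothetical, for illustration only.
reboot_required() {
    sentinel="${1:-/var/run/reboot-required}"
    [ -f "$sentinel" ]
}

if reboot_required; then
    echo "reboot required"
else
    echo "no reboot required"
fi
```

Run on a worker node, this prints whether a pending update is waiting for a reboot.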

Kured Workflow:-

Figure: Kured workflow

With kured, a DaemonSet is deployed that runs a pod on each Linux node in the cluster. These pods in the DaemonSet watch for the existence of the /var/run/reboot-required file, and then initiate a process to reboot the nodes.
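The per-node sequence that kured automates can be approximated with the equivalent manual kubectl steps. This is only a sketch under stated assumptions: kured performs these steps through its own client code and also takes a cluster-wide lock first, which is omitted here. The function names, the DRY_RUN switch, and the node name are hypothetical.

```shell
#!/bin/sh
# Sketch of the cordon -> drain -> reboot -> uncordon order for one node.
# DRY_RUN=1 (the default here) only prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reboot_node_safely() {
    node="$1"
    run kubectl cordon "$node"                      # stop new pods landing on the node
    run kubectl drain "$node" --ignore-daemonsets   # evict the pods running on it
    echo "# reboot of $node happens here"           # placeholder: the reboot itself is not scripted
    run kubectl uncordon "$node"                    # let the node accept pods again
}

# Hypothetical node name, for illustration only.
reboot_node_safely aks-nodepool1-example-0
```

Because the reboot step is only a placeholder, the sketch is safe to run in dry-run mode to see the order of operations.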

Kubernetes & OS Compatibility:-

The daemon image bundles Kubernetes client tooling (client libraries in current releases, a standalone binary in older releases) for the purposes of maintaining the lock and draining worker nodes. Kubernetes aims to provide forwards and backwards compatibility of one minor version between client and server:

Kured datasheet for Supported Kubernetes version
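The one-minor-version rule can be checked with a small helper such as the sketch below. The function names and the sample versions are hypothetical, and the check assumes both versions share the same major version.

```shell
#!/bin/sh
# Extract the minor number from strings such as "1.19" or "v1.19.3".
minor_of() {
    v="${1#v}"        # strip an optional leading "v"
    v="${v#*.}"       # drop the major version and the first dot
    echo "${v%%.*}"   # keep everything before the next dot
}

# Succeed when the two versions differ by at most one minor release.
# Assumes the major versions are equal.
within_one_minor() {
    a="$(minor_of "$1")"
    b="$(minor_of "$2")"
    diff=$((a - b))
    [ "$diff" -ge -1 ] && [ "$diff" -le 1 ]
}

# Sample versions, for illustration only.
if within_one_minor "v1.19.7" "1.20"; then
    echo "within one minor version"
else
    echo "more than one minor version apart"
fi
```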

Installation:-

Kured can be deployed using its Helm chart. The reboot interval, time window, and days of the week can be customized through the chart values.
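A values file for this kind of customization might look like the sketch below, which writes kured_value.yaml from a heredoc. The key names under "configuration" are an assumption based on the chart's documented values and may differ between chart versions, so check them with `helm show values kured/kured` first. The schedule itself is an arbitrary example.

```shell
#!/bin/sh
# Write an example values file for the kured Helm chart.
# Key names are assumed, not taken from the original article; verify them
# against the chart version you install.
write_kured_values() {
    cat > "$1" <<'EOF'
configuration:
  period: "1h"            # how often the sentinel is checked
  rebootDays: [sa, su]    # only reboot on these days
  startTime: "01:00"      # only reboot after this time
  endTime: "05:00"        # only reboot before this time
  timeZone: "UTC"         # time zone for the window above
EOF
}

write_kured_values kured_value.yaml
```

Restricting reboots to a weekend night window like this keeps node restarts away from peak hours.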

Then run the command below to install kured:

helm install kured kured/kured --version 2.4.0 --namespace kured --values kured_value.yaml --set nodeSelector."beta\.kubernetes\.io/os"=linux

Worker package verification for kured functionality:-

Worker node reboot execution by kured:-

Here kured reboots the nodes one at a time, rescheduling their pods as it goes. Once a node is back, kured checks the sentinel file again and concludes that no further reboot is required.

Worker node status after reboot

Kured provides the flexibility to automate pod rescheduling and to cordon and drain each node safely. It reboots the nodes one by one, so the running applications see no downtime.

13+ years of experience in system integration on Linux and 4 years in cloud/DevOps and Ansible. LinkedIn:- https://www.linkedin.com/in/indranil-banerjee-894a5016