Cluster Maintenance in Kubernetes

OS Upgrades

  • Consider a cluster with a few nodes and pods serving applications on those nodes.

  • What happens when one of the nodes goes down?

    • The pods on that node will not be accessible.
  • If the node comes back immediately, then the kubelet process starts and the pods come back online.

  • However, if the node is down for more than five minutes, the pods are terminated from that node; Kubernetes considers them dead.

  • If the pods were part of a ReplicaSet, they are recreated on other nodes.

  • The time the cluster waits for a pod to come back online is known as the pod-eviction-timeout and is set on the kube-controller-manager with a default value of five minutes (see the flag example after this list).

  • So, whenever a node goes offline, the master node waits for up to five minutes before considering the node dead.

  • When the node comes back online after the pod-eviction-timeout, it comes up blank, without any pods scheduled on it.

  • If we are not sure whether a node will be back online within five minutes, there is a safer way to handle it.

    • We can purposefully drain the node of all its workloads, so that the workloads are moved (terminated and recreated on another node) to other nodes in the cluster.

    • When we drain a node, the pods are gracefully terminated on the node they are on and recreated on another.

    • The node is also cordoned or marked as unschedulable, meaning no pods can be scheduled on this node until we specifically remove the restriction.

    • Now that the pods are safe on other nodes, we can reboot the first node.

    • When it comes back online, it is still unschedulable. You then need to uncordon it so that the pods can be scheduled on it again.

    • Remember, the pods that were moved to other nodes do not automatically move back.

    • If any of those pods are deleted, or if new pods are created in the cluster, they can then be scheduled on this node.

    • Apart from drain and uncordon, there is also another concept called cordon.

    • Cordon simply marks the node unschedulable.

    • Unlike drain, it does not terminate or move the pods already running on the node.

    • It simply makes sure that the new pods are not scheduled on that node.
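  • The pod-eviction-timeout mentioned above corresponds to a flag on the kube-controller-manager. As a rough sketch, on a cluster where the controller manager options can be edited, the flag looks like this (note that in recent Kubernetes releases this flag is deprecated and eviction is handled through taint-based eviction, so check the documentation for your version):

      kube-controller-manager --pod-eviction-timeout=5m0s   # default is 5m0s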

Commands

  1. kubectl drain <node_name>

  2. kubectl cordon <node_name>

  3. kubectl uncordon <node_name>
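  A minimal sketch of a typical maintenance flow using these commands, assuming a node named node01 (the exact drain flags needed depend on the workloads running on the node):

      # Evict the workloads and mark the node unschedulable.
      # --ignore-daemonsets is usually required because DaemonSet pods cannot be evicted.
      kubectl drain node01 --ignore-daemonsets

      # The node now shows a SchedulingDisabled status.
      kubectl get nodes

      # Perform the OS upgrade / reboot on node01, then allow scheduling again.
      kubectl uncordon node01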

Cluster Upgrade Process

  • Usually, the core Kubernetes components (kube-apiserver, controller-manager, kube-scheduler, kubelet, kube-proxy) are all at the same version, but this is not mandatory.

  • Since the kube-apiserver is the primary component in the control plane, and it is the component all other components talk to, none of the other components should ever be at a version higher than the kube-apiserver.

  • The controller-manager and kube-scheduler can be at one version lower, i.e. if the kube-apiserver is at version x, the controller-manager and kube-scheduler can be at x-1, and the kubelet and kube-proxy can be at two versions lower, x-2.

  • None of them can be at a version higher than the kube-apiserver.

  • But this is not the case with kubectl: it can be at one version higher or one version lower, i.e. x+1 or x-1.

  • At any time, Kubernetes supports only the three most recent minor versions.

  • The recommended approach is to upgrade one minor version at a time, e.g. 1.10 to 1.11, then 1.11 to 1.12, then 1.12 to 1.13.
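  • For example, with kubeadm the upgrade is performed one minor version at a time, first on the master node and then on the worker nodes one by one. A minimal sketch, assuming a kubeadm-managed cluster and an illustrative target version of v1.12.0 (the package-manager commands for upgrading the kubeadm and kubelet packages vary by OS and are omitted):

      # On the master node (after upgrading the kubeadm package itself):
      kubeadm upgrade plan              # shows the current and available versions
      kubeadm upgrade apply v1.12.0     # upgrades the control plane components
      # upgrade the kubelet package on the master, then:
      systemctl restart kubelet

      # On each worker node, one at a time:
      kubectl drain node01 --ignore-daemonsets   # run from the master or a workstation
      # upgrade the kubeadm and kubelet packages on the worker, then:
      kubeadm upgrade node                        # newer kubeadm; older releases used: kubeadm upgrade node config --kubelet-version v1.12.0
      systemctl restart kubelet
      kubectl uncordon node01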

Backup and Restore

Backup Candidates

  1. Resource Configuration

    • The declarative approach, using Kubernetes object definition files, is the preferred way to create applications on Kubernetes.

    • We should store the resource configuration files in a source code repository like GitHub, which manages and maintains the versions and backups of the files.

    • But there may be a case where a team member does a deployment in an imperative way.

    • So a better approach to backing up resource configuration is to query the kube-apiserver.

    • Query the kube-apiserver using the kubectl command and save the resource configurations of all objects created on the cluster as a copy.

    • For example, one of the commands that can be used in a backup script is:

      kubectl get all --all-namespaces -o yaml > all-deploy-service.yaml

    • It gets all the deployments, pods and services in all namespaces using the kubectl get all command, outputs them in YAML format, and saves them to a file (a fuller script is sketched below).

    • Velero, formerly known as ARK, by Heptio, is a third-party solution that can be used to take backups of a Kubernetes cluster.
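    • A minimal sketch of such a backup script, assuming kubectl access to the cluster (the output directory name is illustrative):

      #!/usr/bin/env bash
      # Dump every namespaced resource type in every namespace as YAML files.
      BACKUP_DIR=cluster-backup-$(date +%F)   # illustrative output directory name
      mkdir -p "$BACKUP_DIR"

      for ns in $(kubectl get namespaces -o jsonpath='{.items[*].metadata.name}'); do
        for kind in $(kubectl api-resources --namespaced=true --verbs=list -o name); do
          kubectl get "$kind" -n "$ns" -o yaml > "$BACKUP_DIR/${ns}_${kind}.yaml" 2>/dev/null
        done
      done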

  2. ETCD Cluster

    • The ETCD cluster stores all the information about the cluster, i.e. all information about the pods, deployments, services, etc.

    • So instead of backing up the resource configuration, we can take a backup of the ETCD cluster itself.

    • As we know, ETCD is hosted on the master node. While configuring etcd, we can specify the location where all the data will be stored, i.e. the data directory.

    • This is the directory that can be configured to be backed up by the backup tool.

    • ETCD also comes with a built-in snapshot solution.

    • We can take a snapshot of the etcd database using the etcdctl utility's snapshot save command; snapshot.db is the snapshot name.

      etcdctl snapshot save snapshot.db

    • After this, a snapshot file is created by the name snapshot.db in the current directory.

    • If we want it to be created in another location, specify the full path along with the name.

    • We can view the status of the backup using snapshot status command.

      etcdctl snapshot status snapshot.db

    • To restore this cluster from backup at a later point in time:

      • First stop the kube-apiserver service, as the restore process requires restarting the ETCD cluster, and the kube-apiserver depends on it.

      • Then run the etcdctl snapshot restore command with the path set to the path of the backup file.

      • When ETCD restores from backup, it initializes a new cluster configuration and configures the members of ETCD as new members to a new cluster.

      • This is to prevent a new member from accidentally joining an existing cluster.

      • On running this command, a new data directory is created.

      • We then configure the ETCD configuration file to use the new data directory.

      • After this, reload the service daemon and restart the etcd and kube-apiserver services.

    • With all the etcdctl commands, we have to specify the certificate files for authentication: the endpoint of the ETCD cluster, the CA certificate, the etcd server certificate, and the key, as in the sketch below.
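    • Putting these together, a minimal sketch of the full backup and restore flow, assuming etcdctl API version 3 and kubeadm-style certificate paths (the paths, endpoint and new data directory are illustrative):

      # Take a snapshot; the certificate flags authenticate etcdctl against the etcd server.
      ETCDCTL_API=3 etcdctl snapshot save /opt/snapshot.db \
        --endpoints=https://127.0.0.1:2379 \
        --cacert=/etc/kubernetes/pki/etcd/ca.crt \
        --cert=/etc/kubernetes/pki/etcd/server.crt \
        --key=/etc/kubernetes/pki/etcd/server.key

      # Verify the snapshot.
      ETCDCTL_API=3 etcdctl snapshot status /opt/snapshot.db

      # Restore the snapshot into a new data directory (runs locally, no endpoint needed).
      ETCDCTL_API=3 etcdctl snapshot restore /opt/snapshot.db --data-dir=/var/lib/etcd-from-backup

      # Point etcd at the new data directory (in the static pod manifest or etcd.service),
      # then reload and restart the services.
      systemctl daemon-reload
      systemctl restart etcd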

  3. Persistent Volumes