An Introduction to Kubernetes DaemonSet

Introduction

Kubernetes is a powerful platform for managing containerized applications. It allows you to deploy, scale, and update your applications with ease. However, sometimes you may need to run some tasks or processes on every node in your cluster, such as monitoring, logging, or storage. How can you achieve that with Kubernetes?

The answer is DaemonSets. A DaemonSet is a special kind of Kubernetes resource that ensures that a copy of a specific pod is running on all (or a subset of) nodes in the cluster. This way, you can have a consistent and reliable environment for your applications across your nodes.

In this article, we will learn what DaemonSets are, how to create and use them, and some tips for working with them.

What is a DaemonSet?

A DaemonSet is a controller that manages the lifecycle of pods that run on each node in the cluster. It creates and deletes pods as nodes join or leave the cluster, and ensures that the pods are always running and healthy.

Some typical use cases for DaemonSets are:

  • Running a cluster storage daemon on every node, such as GlusterFS or Ceph.

  • Running a logs collection daemon on every node, such as Fluentd or Logstash.

  • Running a node monitoring daemon on every node, such as Prometheus Node Exporter or Datadog Agent.

  • Running a troubleshooting tool on every node, such as Node Problem Detector or Sysdig.

A DaemonSet is similar to a Deployment or a StatefulSet in that it manages a set of pods that provide a service. However, unlike Deployments or StatefulSets, which distribute a chosen number of replicas across nodes based on resource availability and scheduling policies, a DaemonSet runs exactly one copy of its pod on every node that matches its node selector and tolerations, and it automatically adds or removes pods as nodes join or leave the cluster.
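
Many clusters already rely on DaemonSets for system components: in kubeadm-based setups, for example, kube-proxy and most CNI network agents run as DaemonSets in the kube-system namespace. You can list the DaemonSets in your cluster with kubectl (the exact output depends on your setup):

kubectl get daemonsets --all-namespaces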

How to create and use a DaemonSet

To create a DaemonSet, you need to define a YAML file that specifies the pod template, an optional node selector, and other parameters. Here is an example of a DaemonSet, created in the kube-system namespace, that runs the Fluentd logging agent on every node:

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      tolerations:
      # these tolerations are to have the daemonset runnable on control plane nodes
      # remove them if your control plane nodes should not run pods
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: fluentd-elasticsearch
        image: quay.io/fluentd_elasticsearch/fluentd:v2.5.2
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log

Let’s break down the YAML file and see what each field means:

  • apiVersion: apps/v1: Specifies the API group and version of the resource being defined; DaemonSets belong to the apps/v1 API.

  • kind: DaemonSet: Specifies the type of Kubernetes resource being defined, which is a DaemonSet.

  • metadata: Contains metadata about the DaemonSet, including its name, namespace, and labels.

  • spec: Contains the specification of the DaemonSet, including the pod selector and the pod template.

  • selector: Defines the label selector that determines which pods belong to the DaemonSet. It must match the labels of the pod template.

  • template: Defines the pod template that the DaemonSet will create on each node. It has the same schema as a pod, except it does not have an apiVersion or kind.

  • tolerations: Defines the tolerations that allow the pod to be scheduled on nodes with taints. In this case, the pod can run on control plane nodes, which are usually tainted with node-role.kubernetes.io/control-plane (or node-role.kubernetes.io/master on older clusters). You can remove these tolerations if you don’t want the pod to run on control plane nodes; a command to inspect node taints follows this list.

  • containers: Defines the container(s) that run in the pod. In this case, there is only one container, named fluentd-elasticsearch, that runs the Fluentd logging agent image from Quay.io. The container also specifies the resource limits and requests, and the volume mounts.

  • terminationGracePeriodSeconds: Defines the grace period for the pod to terminate gracefully. In this case, it is set to 30 seconds.

  • volumes: Defines the volumes that are available to the pod. In this case, there is only one volume, named varlog, that mounts the host’s /var/log directory to the pod.
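
To see why the tolerations above are needed, you can inspect the taints on a node; control plane nodes typically carry a NoSchedule taint. The node name below is illustrative:

kubectl describe node control-plane-1 | grep -i taints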

To create the DaemonSet, you can use the kubectl apply command and pass the YAML file as an argument:

kubectl apply -f daemonset.yaml

This will create the DaemonSet and the pods on each node. You can verify the status of the DaemonSet and the pods using the kubectl get command:

kubectl get daemonset -n kube-system
NAME                  DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
fluentd-elasticsearch   3         3         3       3            3           <none>          5m
kubectl get pods -n kube-system -l name=fluentd-elasticsearch
NAME                        READY   STATUS    RESTARTS   AGE
fluentd-elasticsearch-4j9fz   1/1     Running   0          5m
fluentd-elasticsearch-6q8xh   1/1     Running   0          5m
fluentd-elasticsearch-z7w9s   1/1     Running   0          5m

As you can see, the DaemonSet has created three pods, one on each node in the cluster. The pods have the same name as the DaemonSet, followed by a random suffix. The pods are also labeled with the name of the DaemonSet, which is used for the selector.
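
To confirm that each pod landed on a different node, add the -o wide flag, which includes a NODE column in the output:

kubectl get pods -n kube-system -l name=fluentd-elasticsearch -o wide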

You can also use the kubectl describe command to get more details about the DaemonSet and the pods:

kubectl describe daemonset -n kube-system fluentd-elasticsearch
Name:           fluentd-elasticsearch
Selector:       name=fluentd-elasticsearch
Node-Selector:  <none>
Labels:         k8s-app=fluentd-logging
Annotations:    deprecated.daemonset.template.generation: 1
Desired Number of Nodes Scheduled: 3
Current Number of Nodes Scheduled: 3
Number of Nodes Scheduled with Up-to-date Pods: 3
Number of Nodes Scheduled with Available Pods: 3
Number of Nodes Misscheduled: 0
Pods Status:  3 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:       name=fluentd-elasticsearch
  Containers:
   fluentd-elasticsearch:
    Image:      quay.io/fluentd_elasticsearch/fluentd:v2.5.2
    Port:       <none>
    Host Port:  <none>
    Limits:
      memory:  200Mi
    Requests:
      cpu:     100m
      memory:  200Mi
    Environment:  <none>
    Mounts:
      /var/log from varlog (rw)
  Volumes:
   varlog:
    Type:          HostPath (bare host directory volume)
    Path:          /var/log
    HostPathType:  
Events:
  Type    Reason            Age   From                  Message
  ----    ------            ----  ----                  -------
  Normal  SuccessfulCreate  6m    daemonset-controller  Created pod: fluentd-elasticsearch-4j9fz
  Normal  SuccessfulCreate  6m    daemonset-controller  Created pod: fluentd-elasticsearch-6q8xh
  Normal  SuccessfulCreate  6m    daemonset-controller  Created pod: fluentd-elasticsearch-z7w9s

The output shows the desired and current number of nodes scheduled, the pods status, the pod template, the volumes, and the events related to the DaemonSet.

How to use labels and node selectors to limit the nodes where the DaemonSet pods are scheduled

By default, a DaemonSet will create pods on every node in the cluster that matches the node selector. However, sometimes you may want to limit the nodes where the DaemonSet pods are scheduled, for example, to run the pods only on nodes with a certain label, or to exclude nodes with a certain taint.

To do that, you can use labels and node selectors to filter the nodes based on their attributes. Labels are key-value pairs that you can attach to any Kubernetes object, such as nodes, pods, or services. Node selectors are expressions that match nodes based on their labels.
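
Node labels can be added or changed at any time with kubectl label. For example, assuming a node named node1, you could mark it as a production node like this:

kubectl label nodes node1 env=prod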

For example, suppose you have a cluster with three nodes, each with a different label:

kubectl get nodes --show-labels
NAME      STATUS   ROLES    AGE   VERSION   LABELS
node1     Ready    <none>   10d   v1.21.2   env=prod
node2     Ready    <none>   10d   v1.21.2   env=dev
node3     Ready    <none>   10d   v1.21.2   env=test

If you want to run the DaemonSet pods only on the nodes with the label env=prod, you can add a node selector to the DaemonSet spec, like this:

spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      nodeSelector:
        env: prod
      # rest of the pod spec

The nodeSelector field is a simple key-value match: with this spec, the DaemonSet pods are scheduled only on nodes whose labels include env=prod. nodeSelector does not support operators such as In or NotIn; if you need more expressive rules, for example to exclude nodes with a particular label value, you can use node affinity instead.
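
As a sketch, reusing the env label from the example above, a node affinity rule that keeps the pods off development nodes could be added to the pod spec like this (requiredDuringSchedulingIgnoredDuringExecution makes the rule mandatory at scheduling time):

spec:
  template:
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: env
                operator: NotIn
                values:
                - dev
      # rest of the pod spec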

Note that node selectors and affinity rules only determine which nodes are eligible to run the pods; the DaemonSet still runs exactly one pod on each eligible node. If what you actually need is to spread a fixed number of replicas across nodes rather than run one pod per node, features such as affinity and anti-affinity or topology spread constraints on a Deployment are usually a better fit.

Conclusion

In this article, you learned what DaemonSets are, how to create and use them, and how to use labels and node selectors to limit the nodes where the DaemonSet pods are scheduled. You also saw some examples of use cases for DaemonSets, such as running cluster storage, logging, monitoring, or troubleshooting daemons on every node.

DaemonSets are a useful resource for ensuring that a pod is running on all (or a subset of) nodes in the cluster. They can help you provide a consistent and reliable environment for your applications across your nodes. However, they also have some limitations and challenges, such as managing pod updates, balancing pod distribution, and handling node failures.

To overcome these challenges, you can use other features such as rolling updates, affinity and anti-affinity, topology spread constraints, or node lifecycle controllers. You can also explore other types of controllers, such as Deployments, StatefulSets, Jobs, or CronJobs, depending on your application needs and requirements.
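
For example, a DaemonSet can declare a RollingUpdate strategy so that only a limited number of node agents are replaced at a time when the pod template changes. A minimal sketch, with an illustrative maxUnavailable value:

spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1

With this strategy, changing the pod template (for example, bumping the container image) and re-applying the manifest replaces the pods node by node, and kubectl rollout status daemonset/fluentd-elasticsearch -n kube-system reports the progress.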