Kubernetes has become the leading solution for container orchestration. It really is a powerful system, with endless possibilities for microservices-based system designs. But as we know, with great power comes great responsibility 🙂
Getting good observability on a system running numerous microservices across many nodes can be challenging. Containers are constantly on the move – starting up, terminating, moving between hosts, and whatnot. That agility can save costs, but tracing all of our services and processing all of their outputs? Not so great. Luckily, we have the Elastic Stack.
In this post, I’ll go over deploying the entire stack using this open git repository. All of the components will be deployed to a namespace called “logging” – but you can change it to whatever you like; just make sure to update all of the stack’s YAMLs accordingly.
TL;DR
You can open the kubernetes-elastic-visibility repo on GitHub and run the command in the README.md. You’ll have the full observability stack deployed and ready to use on your cluster.
Namespace
Let’s prepare the terrain and create our namespace. Deploy the following YAML to create the “logging” namespace:
kind: Namespace
apiVersion: v1
metadata:
  name: logging
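Assuming you saved the manifest as namespace.yaml (the file name is yours to choose), apply and verify it with:

kubectl apply -f namespace.yaml
kubectl get namespace logging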
Elasticsearch Cluster
We’ll deploy the Elasticsearch cluster as a StatefulSet. StatefulSets are the preferred way of deploying applications that require state in Kubernetes. The classic example is databases, which need a persistent disk mounted and, in some cases, a specific startup order for the cluster’s nodes. Elasticsearch is no different: we want to be able to tell each node how to reach the other nodes in the cluster, and we want the correct disk mounted to each node. A StatefulSet gives your pods the stable naming convention needed for node discovery, and works with PersistentVolumeClaims to guarantee we mount the correct disks.
Our Elasticsearch cluster will consist of 3 master nodes under a StatefulSet and a single service that will allow other workloads inside the cluster to communicate with it.
Note that there are many ways to run an Elasticsearch cluster; using only master-eligible nodes is just one of them. You can combine the master, data, and ingest node roles in your cluster depending on your requirements.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-cluster
  namespace: logging
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:7.9.3
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: rest
          protocol: TCP
        - containerPort: 9300
          name: inter-node
          protocol: TCP
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
        env:
        - name: cluster.name
          value: k8s-logs
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: discovery.seed_hosts
          value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
        - name: cluster.initial_master_nodes
          value: "es-cluster-0,es-cluster-1,es-cluster-2"
        - name: ES_JAVA_OPTS
          value: "-Xms512m -Xmx512m"
      initContainers:
      - name: fix-permissions
        image: busybox
        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
      - name: increase-vm-max-map
        image: busybox
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      - name: increase-fd-ulimit
        image: busybox
        command: ["sh", "-c", "ulimit -n 65536"]
        securityContext:
          privileged: true
  volumeClaimTemplates:
  - metadata:
      name: data
      labels:
        app: elasticsearch
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: gp2
      resources:
        requests:
          storage: 100Gi
Pay attention to the env section, where we set the cluster hosts. Because we use a StatefulSet named “es-cluster”, each of our pods is assigned an index number starting at 0. Thanks to that guarantee, we can set the seed hosts environment variable in advance with:
es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch
Pattern: <SET-NAME>-<INDEX>.<SERVICE-NAME>
Under the volumeClaimTemplates section, you can see we are assigning 100Gi of persistent storage to each pod using the gp2 storage class. Adjust the size to whatever you forecast your cluster will need, and swap the gp2 storage class depending on your cloud provider. I’m using gp2 because my cluster is hosted on AWS and I want to use a General Purpose SSD.
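If you’re not sure which storage classes your cluster offers, you can list them first:

kubectl get storageclass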
For our cluster to be reachable by other services, we need to create a Service. Because all of our nodes are master-eligible, we can read from and write to any of them.
kind: Service
apiVersion: v1
metadata:
  name: elasticsearch
  namespace: logging
  labels:
    app: elasticsearch
spec:
  selector:
    app: elasticsearch
  clusterIP: None
  ports:
  - port: 9200
    name: rest
  - port: 9300
    name: inter-node
This Service definition exposes our nodes under the DNS name elasticsearch, making them easy to discover and spread requests across from other deployments. Because it’s headless (clusterIP: None), it’s also what gives each pod its stable es-cluster-N.elasticsearch DNS record.
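Once both the StatefulSet and the Service are deployed, a quick sanity check (using the names and ports defined above) is to port-forward one of the pods and ask for the cluster health; with all three master-eligible nodes up you should see "number_of_nodes" : 3 and, ideally, a green status:

kubectl -n logging port-forward es-cluster-0 9200 &
curl 'http://localhost:9200/_cluster/health?pretty'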
Kibana
Kibana will be our UI tool to visualize and explore the data in Elasticsearch. We’ll deploy it as a simple Deployment with a single pod running the app.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: docker.elastic.co/kibana/kibana:7.9.3
        resources:
          limits:
            cpu: 1000m
            memory: 4Gi
          requests:
            cpu: 1000m
            memory: 4Gi
        env:
        - name: ELASTICSEARCH_HOSTS
          value: http://elasticsearch:9200
        ports:
        - containerPort: 5601
Because we created a Service for the Elasticsearch cluster, we can point Kibana at its address through the env section and have it connect on boot. Note that the Kibana 7.x images read the ELASTICSEARCH_HOSTS variable (the older ELASTICSEARCH_URL name is from 6.x). The DNS name for the service is just its name: http://elasticsearch:9200. You can read more about how DNS is resolved in Kubernetes here.
Same as with the Elasticsearch cluster, we want a simple way to reach our Kibana pod. To do that, we’ll create another Service.
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  ports:
  - port: 5601
  selector:
    app: kibana
This service is purely for our convenience and has no operational effect on the cluster. We will use the kubectl CLI to create a tunnel to our Kibana pod (so we won’t have to open it to the public internet). One option is to find the pod name and use it in the command, but it’s easier to give ourselves a shortcut: a static Service name we can proxy traffic through.
To create a tunnel using our Kibana service, run this:
kubectl -n logging port-forward service/kibana 5601
Now you can point your browser to http://localhost:5601 and you should have the Kibana UI ready for use.
Filebeat
Now that we have an Elasticsearch cluster ready to ingest our data, and Kibana ready to show it to us, we can start shipping our logs and metrics. We’ll start with the logs.
In Kubernetes, each pod consists of one or more containers, and each container’s stdout/stderr streams are automatically written to a file on the hosting node. So, if we have access to the host’s storage, we can read these files, which is effectively the same as watching the containers’ stdout/stderr streams. In most cases they are stored on the host machine under this path: /var/log/containers/*.log (one file per container).
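If you want to peek at these files yourself and your kubectl version supports node debugging, you can open a throwaway shell on one of your nodes (kubectl debug mounts the host filesystem under /host; replace <node-name> with one of your node names):

kubectl debug node/<node-name> -it --image=busybox -- ls /host/var/log/containers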
So the plan is to run a pod on each node that will mount the host’s log directory, tail all the files, and ship them to Elasticsearch. To do all of that we’ll use Filebeat, the Elastic Stack’s log shipper. It can tail our files and, with the right processor, even enrich our logs with Kubernetes metadata before writing everything to our Elasticsearch cluster.
First, let’s create the configuration file for Filebeat using a ConfigMap:
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: logging
  labels:
    k8s-app: filebeat
data:
  filebeat.yml: |-
    filebeat.inputs:
    - type: container
      paths:
        - /var/log/containers/*.log
      processors:
        - add_kubernetes_metadata:
            host: ${NODE_NAME}
            matchers:
            - logs_path:
                logs_path: "/var/log/containers/"

    processors:
      - add_cloud_metadata:
      - add_host_metadata:

    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
This config will be mounted into the pod as the filebeat.yml file. It tells Filebeat which files to tail, which processors to run, and where to write its data.
To run a single pod on each node, we’ll use a DaemonSet:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: filebeat
  namespace: logging
  labels:
    k8s-app: filebeat
spec:
  selector:
    matchLabels:
      k8s-app: filebeat
  template:
    metadata:
      labels:
        k8s-app: filebeat
    spec:
      serviceAccountName: filebeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: filebeat
        image: docker.elastic.co/beats/filebeat:7.9.3
        args: [
          "-c", "/etc/filebeat.yml",
          "-e",
        ]
        env:
        - name: ELASTICSEARCH_HOST
          value: elasticsearch
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/filebeat.yml
          readOnly: true
          subPath: filebeat.yml
        - name: data
          mountPath: /usr/share/filebeat/data
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: config
        configMap:
          defaultMode: 0640
          name: filebeat-config
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: varlog
        hostPath:
          path: /var/log
      - name: data
        hostPath:
          path: /var/lib/filebeat-data
          type: DirectoryOrCreate
Each pod in this set will mount the configuration file from the ConfigMap we created a step earlier, mount its host node’s /var/log and /var/lib/docker/containers directories as its own (to tail), and run Filebeat with our Elasticsearch cluster configured as the output (address taken from the env vars).
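One thing to note: the manifest references serviceAccountName: filebeat. If you’re assembling the stack by hand rather than from the repo, you’ll also need the matching ServiceAccount and RBAC objects. A minimal sketch, giving the add_kubernetes_metadata processor the read access to pods, namespaces, and nodes it needs:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: filebeat
  namespace: logging
  labels:
    k8s-app: filebeat
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: filebeat
  labels:
    k8s-app: filebeat
rules:
- apiGroups: [""]
  resources: ["namespaces", "pods", "nodes"]
  verbs: ["get", "watch", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: filebeat
subjects:
- kind: ServiceAccount
  name: filebeat
  namespace: logging
roleRef:
  kind: ClusterRole
  name: filebeat
  apiGroup: rbac.authorization.k8s.io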
If all goes well, we’ll have a single Filebeat container running on each node of our Kubernetes cluster, with logs from all containers flowing into our Elasticsearch cluster.
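To confirm data is actually flowing, you can ask one of the Filebeat pods to test its output connection (recent kubectl versions accept ds/filebeat and pick a pod for you), then check that filebeat-* indices are being created, reusing a port-forward to Elasticsearch:

kubectl -n logging exec ds/filebeat -- filebeat test output -c /etc/filebeat.yml
kubectl -n logging port-forward es-cluster-0 9200 &
curl 'http://localhost:9200/_cat/indices/filebeat-*?v'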
Metricbeat
We have Filebeat running as a log shipper; now we need a metric shipper. Enter Metricbeat, the metric shipper from Elastic.
Metricbeat will help us ship metrics from our host nodes and running pods to Elasticsearch. It does this in two ways: 1) running in the cluster alongside kube-state-metrics (explained below), reading and publishing its metrics; and 2) running on each node, reading the node’s (host’s) stats and publishing them. For the first we’ll have a single pod; for the second, a DaemonSet (just like Filebeat).
First, let’s create the DaemonSet configuration files:
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-daemonset-config
  namespace: logging
  labels:
    k8s-app: metricbeat
data:
  metricbeat.yml: |-
    metricbeat.config.modules:
      # Mounted `metricbeat-daemonset-modules` configmap:
      path: ${path.config}/modules.d/*.yml
      # Reload module configs as they change:
      reload.enabled: false

    processors:
      - add_cloud_metadata:

    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-daemonset-modules
  namespace: logging
  labels:
    k8s-app: metricbeat
data:
  system.yml: |-
    - module: system
      period: 10s
      metricsets:
        - cpu
        - load
        - memory
        - network
        - process
        - process_summary
        - core
        - diskio
        - socket
      processes: ['.*']
      process.include_top_n:
        by_cpu: 5      # include top 5 processes by CPU
        by_memory: 5   # include top 5 processes by memory

    - module: system
      period: 1m
      metricsets:
        - filesystem
        - fsstat
      processors:
      - drop_event.when.regexp:
          system.filesystem.mount_point: '^/(sys|cgroup|proc|dev|etc|host|lib|snap)($|/)'
  kubernetes.yml: |-
    - module: kubernetes
      metricsets:
        - node
        - system
        - pod
        - container
        - volume
      period: 10s
      host: ${NODE_NAME}
      hosts: ["https://${NODE_NAME}:10250"]
      bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
      ssl.verification_mode: "none"
      # If there is a CA bundle that contains the issuer of the certificate used in the Kubelet API,
      # remove ssl.verification_mode entry and use the CA, for instance:
      #ssl.certificate_authorities:
      #- /var/run/secrets/kubernetes.io/serviceaccount/service-ca.crt
    # Currently `proxy` metricset is not supported on Openshift, comment out section
    - module: kubernetes
      metricsets:
        - proxy
      period: 10s
      host: ${NODE_NAME}
      hosts: ["localhost:10249"]
Now, create the DaemonSet:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: metricbeat
  namespace: logging
  labels:
    k8s-app: metricbeat
spec:
  selector:
    matchLabels:
      k8s-app: metricbeat
  template:
    metadata:
      labels:
        k8s-app: metricbeat
    spec:
      serviceAccountName: metricbeat
      terminationGracePeriodSeconds: 30
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: metricbeat
        image: docker.elastic.co/beats/metricbeat:7.9.3
        args: [
          "-c", "/etc/metricbeat.yml",
          "-e",
          "-system.hostfs=/hostfs",
        ]
        env:
        - name: ELASTICSEARCH_HOST
          value: elasticsearch
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/metricbeat.yml
          readOnly: true
          subPath: metricbeat.yml
        - name: data
          mountPath: /usr/share/metricbeat/data
        - name: modules
          mountPath: /usr/share/metricbeat/modules.d
          readOnly: true
        - name: proc
          mountPath: /hostfs/proc
          readOnly: true
        - name: cgroup
          mountPath: /hostfs/sys/fs/cgroup
          readOnly: true
      volumes:
      - name: proc
        hostPath:
          path: /proc
      - name: cgroup
        hostPath:
          path: /sys/fs/cgroup
      - name: config
        configMap:
          defaultMode: 0640
          name: metricbeat-daemonset-config
      - name: modules
        configMap:
          defaultMode: 0640
          name: metricbeat-daemonset-modules
      - name: data
        hostPath:
          # When metricbeat runs as non-root user, this directory needs to be writable by group (g+w)
          path: /var/lib/metricbeat-data
          type: DirectoryOrCreate
These pods will be responsible for running on every node and reading its stats: CPU, memory, network, disk, and more. Note that, as with Filebeat, the metricbeat ServiceAccount referenced above needs matching RBAC objects (the same ServiceAccount / ClusterRole / ClusterRoleBinding pattern shown earlier).
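A quick way to confirm the set landed on every node:

kubectl -n logging rollout status daemonset/metricbeat
kubectl -n logging get pods -l k8s-app=metricbeat -o wide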
On top of that, we need another dedicated pod to read and publish metrics about our workloads’ CPU, memory, disk, etc. For that, we’ll get the help of kube-state-metrics. This official Kubernetes add-on runs inside the cluster, listens to the Kubernetes API server, and exposes metrics about the state of the objects in a format Metricbeat can understand. To deploy kube-state-metrics, just run the following command from the repo root:
kubectl apply -f ./kube-state-metrics
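You can verify it’s up before wiring Metricbeat to it (the label and service name below match the upstream kube-state-metrics manifests; adjust them if your copy differs):

kubectl -n kube-system get pods -l app.kubernetes.io/name=kube-state-metrics
kubectl -n kube-system port-forward service/kube-state-metrics 8080 &
curl -s http://localhost:8080/metrics | head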
Now let’s deploy the ConfigMaps for the Metricbeat pod:
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-deployment-config
  namespace: logging
  labels:
    k8s-app: metricbeat
data:
  metricbeat.yml: |-
    metricbeat.config.modules:
      # Mounted `metricbeat-deployment-modules` configmap:
      path: ${path.config}/modules.d/*.yml
      # Reload module configs as they change:
      reload.enabled: false

    processors:
      - add_cloud_metadata:

    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: metricbeat-deployment-modules
  namespace: logging
  labels:
    k8s-app: metricbeat
data:
  # This module requires `kube-state-metrics` up and running under `kube-system` namespace
  kubernetes.yml: |-
    - module: kubernetes
      metricsets:
        - state_node
        - state_deployment
        - state_replicaset
        - state_pod
        - state_container
        - state_cronjob
        - state_resourcequota
      period: 10s
      host: ${NODE_NAME}
      hosts: ["kube-state-metrics.kube-system.svc.cluster.local:8080"]
And the Deployment:
# Deploy singleton instance in the whole cluster for some unique data sources, like kube-state-metrics
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metricbeat
  namespace: logging
  labels:
    k8s-app: metricbeat
spec:
  selector:
    matchLabels:
      k8s-app: metricbeat
  template:
    metadata:
      labels:
        k8s-app: metricbeat
    spec:
      serviceAccountName: metricbeat
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      containers:
      - name: metricbeat
        image: docker.elastic.co/beats/metricbeat:7.9.3
        args: [
          "-c", "/etc/metricbeat.yml",
          "-e",
        ]
        env:
        - name: ELASTICSEARCH_HOST
          value: elasticsearch
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName
        securityContext:
          runAsUser: 0
        resources:
          limits:
            cpu: 100m
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: config
          mountPath: /etc/metricbeat.yml
          readOnly: true
          subPath: metricbeat.yml
        - name: modules
          mountPath: /usr/share/metricbeat/modules.d
          readOnly: true
      volumes:
      - name: config
        configMap:
          defaultMode: 0640
          name: metricbeat-deployment-config
      - name: modules
        configMap:
          defaultMode: 0640
          name: metricbeat-deployment-modules
If everything is set up correctly, you’ll have a pod on each node reading and publishing that node’s metrics, and another pod running against the kube-state-metrics server and publishing the pods’ stats.
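As a final check, you should now see metricbeat-* indices alongside the filebeat-* ones (assuming the port-forward to Elasticsearch from earlier is still running). From there, create the filebeat-* and metricbeat-* index patterns in Kibana (Stack Management → Index Patterns) and start exploring:

curl 'http://localhost:9200/_cat/indices/filebeat-*,metricbeat-*?v'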
Conclusion
Elastic has a good suite of services that provides an end-to-end solution for metrics and logs. Deploying this stack is quick and easy and gives good coverage of our cluster’s observability. Although the Elastic Stack is great, there are plenty of other solutions out there that can give you the same end result. It’s worth checking them out and choosing the one that best fits your requirements.
And one last semi-warning I like to add when talking about any drop-in solution for Kubernetes. Kubernetes is a fairly complicated system: services are constantly moving and changing, there are lots of internal dependencies, microservices, and APIs, and there are many possible setups (self-managed on the cloud, fully managed, on-premises). A drop-in solution (of any kind) is nice and can save us a lot of time, but we shouldn’t forget that behind these few YAMLs lies a lot of work. When deploying a solution to a production cluster, we should get to know its internals, how things work, and the bits and bytes of our solution, so that we can support and maintain it.