Building a Kubernetes log management strategy


In this first blog of our how-to series on log management for Kubernetes, learn about the current state of today’s workloads and how to build a cluster and deploy a sample workload.

Eric D. Schabell | Director of Technical Marketing and Evangelism | Chronosphere

Eric is Chronosphere’s Director of Technical Marketing and Evangelism. He’s renowned in the development community as a speaker, lecturer, author, and baseball expert. His current role allows him to help the world understand the challenges organizations are facing with cloud native observability. He brings a unique perspective to the stage, with a professional life dedicated to sharing his deep expertise in open source technologies and organizations, and he is a CNCF Ambassador.


Getting started

Over the course of this series, you will be taking a look at:

  • How to set up a Kubernetes cluster and deploy a sample workload to generate telemetry log data 
  • How to create a telemetry pipeline with Fluent Bit and Chronosphere. Fluent Bit is an open-source telemetry pipeline that can read, transport, and transform logs, metrics, and traces. Chronosphere is an open-source-based observability platform.
  • How to control the size of your log volumes with transformations such as deduplicating and aggregating records.

Deploying a cluster and a workload

Developers today have the luxury of living in an ephemeral world, as they tend to assume the cloud is someone else’s infrastructure and not their problem. Once developers start dumping bytes into an object bucket in the cloud, there is no going back. While the cloud provides simplicity and rapid scalability, there is an all-too-real risk of overshooting your budget as data proliferates across your infrastructure.

Don’t believe this could happen to you? Read along to find out more about these challenges that we can tackle with a log management strategy.

This blog is the first in a series of practical how-to’s, aiming to guide the reader through the process of creating a log management strategy. The Cloud Native Computing Foundation (CNCF)’s observability landscape today contains over 100 log management tools and is perhaps guilty of inducing decision fatigue. The advantages of log management cannot be overstated, however. From drawing meaningful insights to staying on top of outages, logs can go a long way toward keeping things running smoothly.

Let’s get started.

Understanding log management for Kubernetes

Telemetry data can be generated in our cloud native infrastructure in the form of metrics, logs, events, and traces. For this article, we want to narrow our focus to just the logs. Logging is a fundamental construct that is important in building reliable systems.

Logging offers the potential to:

  • Discover, predict, and remediate problems
  • Learn more about workloads and offer insights into improving them

There is no such thing as a free lunch, however. There are some important caveats to remember:

  • Logs are a fundamental telemetry data type, but tend to be verbose in volume.
  • Logs can easily proliferate as cloud infrastructure auto-scales, burning up budgets.
  • Logs contain a lot of not-so-useful information, which slows down incident response.

When it comes to log management for Kubernetes workloads, there are certain challenges to be mindful of:

Retaining logs in the cloud: Serverless architectures leverage cloud service providers and simplify managing a cluster’s lifecycle. This, however, introduces complexity in other areas such as, you guessed it: log collection and retention. When things go wrong, pods are repeatedly deleted and re-deployed. As a result, the associated log files often get deleted and are lost forever.
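
To see this in practice, note that kubectl can only reach logs for containers that still exist: the --previous flag reaches back a single restart at most, and once the pod object itself is deleted, its logs are gone with it. A quick illustration, with a placeholder pod name:

$ kubectl logs my-app-7d4b9c-x2x1z              # logs from the current container
$ kubectl logs my-app-7d4b9c-x2x1z --previous   # logs from the previous restart, if one exists
$ kubectl delete pod my-app-7d4b9c-x2x1z
$ kubectl logs my-app-7d4b9c-x2x1z              # fails: the pod, and its logs, no longer exist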

Kubernetes is layered: Depending on the environment, log management for Kubernetes will encompass a subset of:

  • Platform: How is the cluster performing overall? Are all nodes healthy? Are there any limits or resource constraints at all?
  • Infrastructure: Are the servers/VMs running a cluster looking healthy? This typically applies to self-managed clusters on-premises, as well as clusters deployed on VMs on the cloud.
  • Workload: Workloads typically consist of multiple components, such as a web frontend and a database backend.
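
As a concrete illustration, a handful of kubectl commands can help answer questions at each of these layers once a cluster is running. This is only a rough sketch: kubectl top requires a metrics server in the cluster, and the ghost namespace and pod name are borrowed from the demo installed later in this post.

# Platform: overall cluster and node health
$ kubectl get nodes -o wide
$ kubectl describe node <node-name> | grep -A 5 "Allocated resources"

# Infrastructure: resource pressure on the machines backing the cluster
$ kubectl top nodes

# Workload: the components that make up an application
$ kubectl get pods -n ghost
$ kubectl logs -n ghost ghost-dep-mysql-0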

With all of this in mind, let’s set up an example demo environment to explore a log management strategy.

Whitepaper: Getting Started with Fluent Bit and OSS Telemetry Pipelines. Learn how to navigate the complexities of telemetry pipelines with Fluent Bit.

Setting up a demo environment

The goal here is to deploy a log-generating workload, and this is what we will create as our demo environment:

  • A two node Kubernetes cluster. There are several ways to go about this. This article uses kind for its simplicity and ability to run on a local machine (a sample configuration follows this list).
  • Podman as the container runtime. Podman provides an installer for most common operating systems, making installs convenient.
  • Helm is used to deploy the workload we will be using to generate log telemetry data.
  • Ghost is a content management system (CMS) that will be our workload for generating log telemetry data.
  • Kubectl tooling for command line interactions with our Kubernetes cluster and its workloads.
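
For reference, a two node kind cluster like the one described above can be defined with a small configuration file passed to kind on its standard input. This is only a sketch of what such a setup looks like; the demo’s installation script drives all of this for you, so there is no need to run it by hand:

$ export KIND_EXPERIMENTAL_PROVIDER=podman
$ cat <<EOF | kind create cluster --name 2node --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
  - role: worker
EOF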

The Logs Control Install Demo repository contains the project used throughout this series. Its installation script automatically verifies all requirements and stops if they are not met, suggesting how to rectify any missing dependency.

Prerequisites:

  • Podman 5.x or higher installed with a podman machine started.
  • Kind 0.27.x or higher installed.
  • Helm 3.x or higher installed.
  • Kubectl 1.32.x or higher installed.
  • An active connection to the internet. This is required to download binaries and container images.
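
Each prerequisite can be confirmed quickly from a terminal before starting; the exact version output format varies slightly between releases:

$ podman --version
$ podman machine list          # confirms a machine exists and is running
$ kind version
$ helm version --short
$ kubectl version --client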

Downloading and installing 

  1. Download and unzip the logs control install demo project.
  2. Run ‘init.sh’ from the project directory to create a Kubernetes cluster using the Podman container platform and install the Ghost blogging platform to generate logs.
[INFO]    Checking that podman machine is running...

[ERROR]    ========================================================
[ERROR]                                                                            
[ERROR]    Podman machine is not running. Please make sure you have initialized and 
[ERROR]    started the Podman machine as follows and rerun this installation script 
[ERROR]    again. If possible add extra memory to a new Podman virtual machine as   
[ERROR]    follows:                                                                 
[ERROR]                                                                            
[ERROR]     $ podman machine init --memory 4096                                     
[ERROR]                                                                              
[ERROR]     $ podman machine start                                                  
[ERROR]                                                                              
[ERROR]    =========================================================

The installation script kicks off with some verification checks to ensure the proper tooling and minimum versions are installed. In the case shown above, it detects that the Podman machine is not running. We are told to create a new virtual machine with some extra memory and start it before rerunning the installation. This time the installation continues onwards to verify that the path is clear for the installation of a new two node cluster named 2node.
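
As a side note on the memory step above: if a Podman machine already exists, recent Podman versions can also resize a stopped machine rather than requiring a brand new one. Verify this against your installed Podman version:

$ podman machine stop
$ podman machine set --memory 4096
$ podman machine start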

…
[INFO]    Checking that podman machine is running...
[INFO]    Checking if kubectl is installed...
[INFO]    Checking for kubectl version...
[INFO]    installed kubectl version is v1.32...
[INFO]    Starting Kubernetes cluster deployment...
[INFO]    removing existing installation directory...
[INFO]    Creating directories for Kubernetes control and worker nodes...
[INFO]    Removing any previous k8s cluster named 2node:

using podman due to KIND_EXPERIMENTAL_PROVIDER
enabling experimental podman provider
Deleting cluster "2node" 

After cleaning up any pre-existing demo installations, we see that the new cluster is created as follows:

[INFO]    Creating a new cluster named 2node:

using podman due to KIND_EXPERIMENTAL_PROVIDER
enabling experimental podman provider
Creating cluster "2node" ...
 ✓ Ensuring node image (kindest/node:v1.32.2) 🖼
 ✓ Preparing nodes 📦 📦 
 ✓ Writing configuration 📜
 ✓ Starting control-plane 🕹️
 ✓ Installing CNI 🔌
 ✓ Installing StorageClass 💾
 ✓ Joining worker nodes 🚜
Set kubectl context to "kind-2node"
You can now use your cluster with:

kubectl cluster-info --context kind-2node

Have a nice day! 👋
Kubernetes cluster creation successful…
…

Be aware that the very first time you create this cluster, it needs to download the node image. It will therefore pause on that line in the console output until the download completes. Any subsequent installations of the cluster will be created very quickly with that node image cached.
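
If you would like to warm that cache ahead of time, the node image listed in the output can be pulled directly with Podman. The tag needs to match the one kind reports, v1.32.2 in the run shown above:

$ podman pull docker.io/kindest/node:v1.32.2
$ podman images | grep kindest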

Now that we have a cluster to install a workload on, the next steps are setting the stage for the Ghost CMS installation. We create the persistent volumes that are mounted on the cluster for the CMS database to write data to and then create the ghost namespace where our workload will live.

…
[INFO]    Setting context to use kind-2node...
[INFO]    Creating persistent volumes for Ghost workload...
[INFO]    Adding Ghost helm chart to our local repo...

…
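
The script handles these steps for us, but done by hand they would look roughly like the following. Treat this as a sketch only; the names, storage size, and host path are assumptions, and the actual manifests in the demo project may differ:

$ kubectl create namespace ghost
$ cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ghost-mysql-pv
spec:
  capacity:
    storage: 8Gi
  accessModes:
    - ReadWriteOnce
  storageClassName: standard
  hostPath:
    path: /tmp/ghost-mysql
EOF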

To avoid any waiting when using the Ghost Helm chart, we add it to our local Helm repository before deploying it. The deployment takes two or three minutes to fully complete, and we are asked to wait for that to happen. In the background, a kubectl command holds the installation up while waiting for the two workload pods to fully spin up.
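
Done manually, the repository add, the deployment, and the wait would look roughly like this. The chart source, release name, and default values are assumptions based on the pod names in the output below, not the script’s exact commands:

$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm repo update
$ helm install ghost-dep bitnami/ghost --namespace ghost
$ kubectl wait --namespace ghost --for=condition=Ready pod --all --timeout=300s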

Once the pods are up and running, we are shown that their wait conditions have been met and the status is displayed. Port forwarding is applied so that the pods can be reached from our local machine’s browser, and the CMS credentials are collected by querying the running CMS workload.
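
The equivalent manual steps would be something along these lines; the service name, port mapping, and secret key are assumptions and may not match what the script actually runs:

$ kubectl port-forward --namespace ghost svc/ghost-dep 2368:2368 &
$ kubectl get secret --namespace ghost ghost-dep -o jsonpath="{.data.ghost-password}" | base64 -d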

…
[INFO]    Deploying Ghost workload with Helm chart...
[INFO]    Confirming deployments have completed on the cluster... takes a few minutes...
[INFO]    Get a coffee or tea while you wait...

pod/ghost-dep-59476db967-bdlf7 condition met
pod/ghost-dep-mysql-0 condition met

[INFO]    The status of all Ghost pods in the cluster are:

NAME                         READY   STATUS    RESTARTS   AGE
ghost-dep-59476db967-bdlf7   1/1     Running   0          76s
ghost-dep-mysql-0            1/1     Running   0          76s

[INFO]    Forwarding the necessary Ghost port 2368 for UI...
[INFO]    Retrieving username and password for Ghost platform login...

…

All of this information is then presented to us in the console before the installation is completed*.

*Note: Astute readers will notice that we skipped a small section of the output that shows some activity around Fluent Bit. This will be discussed in the second part of this series, so just ignore that output for now.

…
[INFO]    ======================================================
[INFO]    =                                                    =
[INFO]    =  Install complete, Ghost platform available on     =
[INFO]    =  a Kubernetes 2 node cluster generating logs.      =
[INFO]    =                                                    =
[INFO]    =  The Ghost blog platform can be viewed at:         =
[INFO]    =                                                    =
[INFO]    =    http://localhost:2368                           =
[INFO]    =    http://localhost:2368/ghost  (admin panel)      =
[INFO]    =                                                    =
[INFO]    =    Username: [email protected]                     =
[INFO]    =    Password: wpkt7YyOPZ                            =
[INFO]    =                                                    =
[INFO]    =  Note: When finished using the demo setup,         =
[INFO]    =  you can shut it down and come back later to       =
[INFO]    =  restart it:                                       =
[INFO]    =                                                    =
[INFO]    =    $ podman machine stop                           =
[INFO]    =                                                    =
[INFO]    =  To clean up the cluster entirely either rerun     =
[INFO]    =  the installation script to cleanup existing       =
[INFO]    =  cluster and reinstall a new Ghost platform, or    =
[INFO]    =  run the following command:                        =
[INFO]    =                                                    =
[INFO]    =    $ kind --name=2node delete cluster              =
[INFO]    =                                                    =
[INFO]    ======================================================

If we look at the init.sh script, we find variables at the top for setting the CMS installation credentials if so desired. Our installation just used the defaults.

  1. USERNAME: This should be set to the desired username. It is set to “adminuser” if not modified.
  2. EMAIL_ADDRESS: This should be set to an accessible email address. This parameter is used when logging into the Ghost interface. It is set to “[email protected]” if not modified.
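
For example, the top of init.sh could be edited before running it to look something like this, with hypothetical values of your own choosing:

# near the top of init.sh
USERNAME="myblogadmin"
EMAIL_ADDRESS="you@example.com"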

The final verification is to open the Ghost application in your browser at http://localhost:2368, where you should see the default Ghost welcome page.

For completeness, we can access the admin panel at http://localhost:2368/ghost.

What’s next

At this point, we have a two node Kubernetes cluster running on our local machine with a Ghost CMS running as a containerized workload. Up next, we continue our journey by creating a telemetry pipeline to stream Ghost’s logs.

Buyer’s Guide: Telemetry Pipelines

Build a smarter telemetry pipeline. Download The Buyer’s Guide to Telemetry Pipelines
