Getting started
Over the course of this series, you will be taking a look at:
- How to set up a Kubernetes cluster and deploy a sample workload to generate telemetry log data
- How to create a telemetry pipeline with Fluent Bit and Chronosphere. Fluent Bit is an open-source telemetry pipeline that can read, transport, and transform logs, metrics, and traces. Chronosphere is an observability platform built on open source.
- How to control the size of your log volumes, what transformations are, and how to deduplicate and aggregate records
Deploying a cluster and a workload
Developers today have the luxury of living in an ephemeral world: they tend to assume the cloud is someone else’s infrastructure and therefore not their problem. But once you start dumping bytes into an object bucket in the cloud, there is no going back. While the cloud provides simplicity and rapid scalability, there is an all-too-real risk of overshooting your budget as data proliferates across your infrastructure.
Don’t believe this could happen to you? Read along to find out more about these challenges that we can tackle with a log management strategy.
This blog is the first in a series of practical how-to’s aiming to guide the reader through the process of creating a log management strategy. The Cloud Native Computing Foundation (CNCF) observability landscape today contains over 100 log management tools and is perhaps guilty of inducing decision fatigue. The advantages of log management cannot be overstated, however. From drawing meaningful insights to staying on top of outages, logs can go a long way toward keeping things running smoothly.
Let’s get started.
Understanding log management for Kubernetes
Telemetry data is generated in our cloud native infrastructure in the form of metrics, logs, events, and traces. For this article, we narrow our focus to just the logs. Logging is a fundamental building block of reliable systems.
Logging offers the potential to:
- Discover, predict, and remediate problems
- Learn more about workloads and offer insights into improving them
There is no such thing as a free lunch, however. There are some important caveats to remember:
- Logs are a fundamental telemetry data type, but they tend to be verbose and high in volume.
- Logs can easily proliferate as cloud infrastructure auto-scales, burning up budgets.
- Logs often contain a lot of low-value information, which slows down incident response.
When it comes to log management for Kubernetes workloads, there are certain challenges to be mindful of:
Retaining logs in the cloud: Serverless architectures leverage cloud service providers and simplify managing a cluster’s lifecycle. This, however, introduces complexity in other areas such as, you guessed it, log collection and retention. When things go wrong, pods are repeatedly deleted and redeployed, and their associated log files often get deleted and lost forever.
Kubernetes is layered: Depending on the environment, log management for Kubernetes will encompass a subset of:
- Platform: How is the cluster performing overall? Are all nodes healthy? Are there any limits or resource constraints in play?
- Infrastructure: Are the servers/VMs running a cluster looking healthy? This typically applies to self-managed clusters on-premises, as well as clusters deployed on VMs on the cloud.
- Workload: Workloads typically consist of multiple components, such as a web frontend and a database backend.
With all of this in mind, let’s set up an example demo environment to explore a log management strategy.
Setting up a demo environment
The goal here is to deploy a log-generating workload. This is what we are going to be creating as our demo environment:
- A four-node Kubernetes cluster. There are several ways to go about this; this article uses kind for its simplicity and suitability for use on a local machine.
- Podman as the container runtime. Podman provides an installer for most common operating systems, making installs convenient.
- Helm is used to deploy the workload we will be using to generate log telemetry data.
- Ghost is a content management system (CMS) that will be our workload for generating log telemetry data.
- Kubectl tooling for command line interactions with our Kubernetes cluster and its workloads.
The Logs Control Install Demo repository contains the project used throughout this series. Its installation script automatically verifies all requirements and stops if they are not met, suggesting how to rectify any missing dependency.
Prerequisites (a quick way to verify these is sketched after the list):
- Podman 5.x or higher installed with a podman machine started.
- Kind 0.27.x or higher installed.
- Helm 3.x or higher installed.
- Kubectl 1.32.x or higher installed.
- An active connection to the internet. This is required to download binaries and container images.
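A quick way to confirm the installed versions from a terminal before starting:
$ podman --version
$ kind version
$ helm version --short
$ kubectl version --client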
Downloading and installing
- Download and unzip the logs control install demo project.
- Run ‘init.sh’ from the project directory to create a Kubernetes cluster using the Podman container platform and install the Ghost blogging platform to generate logs.
$ ./init.sh
…
Checking that podman machine is running...
Podman machine is not running. Please make sure you have initialized and started the
Podman machine as follows and rerun this installation script again. If possible add
extra memory to a new Podman virtual machine as follows:
$ podman machine init --memory 4096
$ podman machine start
The installation script kicks off with some verification checks to ensure the proper tooling and minimum versions are installed. In the case shown above, the script detects that the Podman machine is not running. We are told to create a new virtual machine with some extra memory and start it before rerunning the installation. This time the installation continues on, verifying that the path is clear for installing a new four-node cluster named 4node.
…
Checking that podman machine is running...
Checking if kubectl is installed...
Checking for kubectl version
Starting Kubernetes cluster deployment...
- removing existing installation directory...
Creating directories for Kubernetes control and worker nodes…
Removing any previous k8s cluster named 4node...
using podman due to KIND_EXPERIMENTAL_PROVIDER
enabling experimental podman provider
Deleting cluster "4node" ...
…
After cleaning up any pre-existing demo installations, we see the new cluster named 4node being created as follows:
Starting the new cluster named 4node...
using podman due to KIND_EXPERIMENTAL_PROVIDER
enabling experimental podman provider
Creating cluster "4node" ...
✓ Ensuring node image (kindest/node:v1.32.2) 🖼
✓ Preparing nodes 📦 📦 📦 📦
✓ Writing configuration 📜
✓ Starting control-plane 🕹️
✓ Installing CNI 🔌
✓ Installing StorageClass 💾
✓ Joining worker nodes 🚜
Set kubectl context to "kind-4node"
You can now use your cluster with:
kubectl cluster-info --context kind-4node
Have a nice day! 👋
Kubernetes cluster creation successful…
…
Be aware that the very first time you create this cluster, it needs to download the node image. It will therefore pause on that line in the console output until the download completes. Any subsequent installations of the cluster will be created very quickly with that node image cached.
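For reference, the cluster the script builds can be approximated by hand using kind’s Podman provider. The node layout below (one control plane plus three workers) is an assumption based on the cluster name and the console output; the demo script may use a different configuration:
$ export KIND_EXPERIMENTAL_PROVIDER=podman
$ cat <<EOF | kind create cluster --name 4node --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
EOF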
A file called 4nodeconfig.yaml is then created with the cluster’s configuration, which is also shown to us in the console. This file is used by the kubectl commands in the rest of the installation and is available at support/4nodeconfig.yaml.
…
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: DATA+OMITTED
    server: https://127.0.0.1:54021
  name: kind-4node
contexts:
- context:
    cluster: kind-4node
    user: kind-4node
  name: kind-4node
current-context: kind-4node
kind: Config
preferences: {}
users:
- name: kind-4node
  user:
    client-certificate-data: DATA+OMITTED
    client-key-data: DATA+OMITTED
…
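With that kubeconfig in hand, you can point kubectl at the demo cluster directly, for example:
$ kubectl --kubeconfig support/4nodeconfig.yaml get nodes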
Now that we have a cluster to install a workload on, the next steps set the stage for the Ghost CMS installation. We create the persistent volumes the CMS and its database will write data to, and then create the ghost namespace where our workload will live.
…
Creating persistent volumes for Ghost workload...
persistentvolume/ghost-content-volume created
persistentvolume/ghost-database-volume created
Creating namespace for Ghost workload...
namespace/ghost created
…
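The exact manifests live in the demo project, but a minimal sketch of this stage looks roughly like the following; the storage size and hostPath location shown here are illustrative assumptions, and the content volume is created the same way:
$ cat <<EOF | kubectl --kubeconfig support/4nodeconfig.yaml apply -f -
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ghost-database-volume
spec:
  capacity:
    storage: 5Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: /tmp/ghost-database
EOF
$ kubectl --kubeconfig support/4nodeconfig.yaml create namespace ghost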
To avoid any waiting when using the Ghost Helm chart, we add the chart to our local Helm repositories before deploying it. The deployment takes two to three minutes to fully complete, and we are asked to wait while that happens. In the background a kubectl command holds the installation up, waiting on the two workload pods to fully spin up.
Once the pods are up and running, we are shown that their wait conditions have been met and their status is displayed. Port forwarding is applied so that the pods can be reached from a browser on our local machine, and the CMS credentials are collected by querying the running workload.
…
Deploying Ghost workload with Helm chart...
Confirming deployments have completed successfully on the cluster... it takes a few minutes...
pod/ghost-dep-c594d5cb-gws8k condition met
pod/ghost-dep-mysql-0 condition met
The status of all Ghost pods in the cluster are:
NAME READY STATUS RESTARTS AGE
ghost-dep-c594d5cb-gws8k 1/1 Running 0 2m35s
ghost-dep-mysql-0 1/1 Running 0 2m35s
Forwarding the necessary Ghost port 2368 for UI...
Retrieving username and password for Ghost platform login...
…
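Under the hood, this stage boils down to a Helm install, a kubectl wait, a port-forward, and a secret lookup. The sketch below shows roughly equivalent commands; the chart source (Bitnami’s Ghost chart), release name, and secret key are assumptions inferred from the pod names above, so the demo script may differ:
$ helm repo add bitnami https://charts.bitnami.com/bitnami
$ helm install ghost-dep bitnami/ghost --namespace ghost \
    --set ghostUsername=adminuser --set [email protected]
$ kubectl --kubeconfig support/4nodeconfig.yaml -n ghost wait pod --all --for=condition=Ready --timeout=300s
$ kubectl --kubeconfig support/4nodeconfig.yaml -n ghost port-forward deploy/ghost-dep 2368:2368 &
$ kubectl --kubeconfig support/4nodeconfig.yaml -n ghost get secret ghost-dep -o jsonpath='{.data.ghost-password}' | base64 -d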
The access URL and login credentials are then presented to us in a summary banner as the installation completes.
…
======================================================
= =
= Install complete, Ghost platform available on =
= a Kubernetes 4 node cluster generating logs. =
= =
= The Ghost blog platform can be viewed at: =
= =
= http://localhost:2368 =
= http://localhost:2368/ghost (admin panel) =
= =
= Username: [email protected] =
= Password: lNzaV6tOHX =
= =
= Note: When finished using the demo setup, =
= you can shut it down and come back later to =
= restart it: =
= =
= $ podman machine stop =
= =
= To clean up the cluster entirely either rerun =
= the installation script to cleanup existing =
= cluster and reinstall a new Ghost platform, or =
= run the following command: =
= =
= $ kind --name=4node delete cluster =
= =
======================================================
If we look at the init.sh script, we find variables at the top for setting the CMS installation credentials if desired (see the sketch after this list). Our installation simply used the defaults:
- USERNAME: This should be set to the desired username. It is set to “adminuser” if not modified.
- EMAIL_ADDRESS: This should be set to an accessible email address. This parameter is used when logging into the Ghost interface. It is set to “[email protected]” if not modified.
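Editing those variables at the top of init.sh before rerunning the installation might look something like this; the values are placeholders and the exact variable syntax in the script is an assumption:
USERNAME=myblogadmin
[email protected]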
The final verification is to open the Ghost application in your browser at http://localhost:2368, where you should see the Ghost blog landing page.
For completeness, we can access the admin panel at http://localhost:2368/ghost.
What’s next
We now have an active workload running a customizable blog. To make the rest of this series more interesting, we suggest creating a few blog entries; this gives us more engaging data to work with when we start on our log control strategy.
At this point, we have a four-node Kubernetes cluster running on our local machine with a Ghost CMS running as a containerized workload. Up next, we continue our journey by creating a telemetry pipeline to stream Ghost’s logs. Stay tuned!
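If you want a quick peek at the raw log data the workload is already producing, you can tail the Ghost deployment’s logs; the deployment name below matches the pod names shown earlier:
$ kubectl --kubeconfig support/4nodeconfig.yaml -n ghost logs deploy/ghost-dep --follow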
Ready to learn more?
Watch the webinar: Logs: Love ’em, don’t leave ’em