Tackling log management strategy challenges
Developers today have the luxury of living in an ephemeral world: the cloud is easy to treat as someone else’s infrastructure and therefore not their problem. But once teams start dumping bytes into an object bucket in the cloud, there is no going back. While the cloud provides simplicity and rapid scalability, there is an all-too-real risk of overshooting your budget as data proliferates across your infrastructure. Don’t believe this could happen to you? Read on to find out how a log management strategy can tackle these challenges.
What you’ll learn by reading this blog series
This blog is the second in a series of practical how-tos that guides the reader through the process of creating a log management strategy. In the first part, we installed a demo environment on our local machine with a two-node Kubernetes cluster and a Ghost CMS as the containerized workload generating log telemetry data.
In this blog, we will continue to expand our demo environment by installing Fluent Bit. Fluent Bit is an open source telemetry pipeline that can read, transport, and transform logs, metrics, and traces.
Still to come in this series:
- Learning how to start controlling your log volumes with the tooling Fluent Bit provides
- Learning how to create a telemetry pipeline with Fluent Bit and Chronosphere, an open source-based observability platform
Let’s get started.
Logs in the demo environment
In the first article of this series you installed a Kubernetes cluster with a content management system (CMS) in the form of a blogging platform. Also installed automatically, but not yet covered in detail, was a Fluent Bit instance. This instance streams, or collects, all logs from the entire Kubernetes cluster: control nodes, worker nodes, and all container workloads.
Looking towards the end of the installation project’s console output, you can see this happening in the following section of the installation.
…
[INFO] Adding Fluent Bit Helm chart to our local repo...
[INFO] Setting up Fluent Bit logging for containers on the cluster:
[INFO] Setting cluster content to logging namespace...
[INFO] Confirming logging deployment completed successfully on the cluster:
pod/fluent-bit-7qmng condition met
[INFO] The status of all Fluent Bit pods in the cluster are:
NAME READY STATUS RESTARTS AGE
fluent-bit-7qmng 1/1 Running 0 7s
…
You can see that Fluent Bit is installed using a Helm chart with some adjusted variables. If you are curious, inspect the values that have been adjusted in the fluentbit-helm.yaml file found in the support directory.
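If you also want to confirm which values the running release is actually using, a quick way is to ask Helm for them directly. This assumes the release name fluent-bit and the logging namespace used throughout this series.
# display the values applied to the Fluent Bit Helm release.
#
$ helm get values --kubeconfig target/2nodeconfig.yaml fluent-bit --namespace logging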
To validate that log collection is working in our Fluent Bit instance, you can inspect its collection results in the Fluent Bit pod logs as follows.
# display Fluent Bit collected logs.
#
$ kubectl --kubeconfig target/2nodeconfig.yaml logs fluent-bit-7qmng --namespace logging
…
[CUT_LONG_LOG_OUTPUT_HERE]
[64]kube.var.log.containers.ghost-dep-mysql-0_ghost_mysql-a9561bbcd13fa9693c1800d1d9d7d45c598eee35a5475d2a372a71e3c8b2eca3.log: [[1743406540.172325970, {}], {"time"=>"2025-03-31T07:35:40.17232597Z", "stream"=>"stdout", "_p"=>"F", "log"=>"2025-03-31T07:35:40.172158Z 0 [System] [MY-010931] [Server] /opt/bitnami/mysql/bin/mysqld: ready for connections. Version: '8.4.0' socket: '/opt/bitnami/mysql/tmp/mysql.sock' port: 3306 Source distribution."}]
In the log output you can see that our Fluent Bit instance is configured to collect only logs whose file names contain the word ‘ghost’, as we are only interested in log messages from our Ghost CMS workload. This is done using the tail input plugin, with the output plugin defined to simply print all collected log messages to the console.
…
pipeline:
  inputs:
    - name: tail
      tag: kube.*
      read_from_head: true
      #path: /var/log/containers/*.log
      path: /var/log/containers/*ghost*
      multiline.parser: docker, cri
  outputs:
    - name: stdout
      match: '*'
…
Included for reference, but commented out, is the line pointing to all container log files collected by Kubernetes under the path /var/log/containers/*.log. If this line is enabled instead, the log output will include logs from the control node, the worker node, and anything else installed or auto-scaled up in this Kubernetes cluster.
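Should you enable that broader path, it is worth keeping Fluent Bit from tailing its own pod logs and feeding them back into the pipeline. A minimal sketch of that variation uses the tail plugin’s exclude_path parameter; the glob shown is an assumption based on this demo’s fluent-bit-* pod naming.
…
pipeline:
  inputs:
    - name: tail
      tag: kube.*
      read_from_head: true
      path: /var/log/containers/*.log
      # assumption: this demo's Fluent Bit pods are named fluent-bit-*
      exclude_path: /var/log/containers/fluent-bit*
      multiline.parser: docker, cri
…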
Manning Book: Fluent Bit with Kubernetes
Learn how to optimize observability systems for Kubernetes. Download Fluent Bit with Kubernetes now!
Getting control of logs
The next step in your log management strategy is to start taking some control over the logs you are collecting with your Fluent Bit instance. For this example, you have a small, fixed set of log messages from our Ghost CMS workload that is automatically reloaded on every change we implement, ensuring you have consistent log messages to learn from. In real-world scenarios, these logs would be generated dynamically based on the workload’s settings and usage by system users.
The first step is to parse our logs into a machine-readable format, JSON. This is done by adjusting the output plugin parameters in the file support/fluentbit-helm.yaml as shown below.
…
outputs:
  - name: stdout
    match: '*'
    format: json_lines
…
You can then update your Fluent Bit with this new configuration using the following Helm upgrade command, noting that this creates a new Fluent Bit pod name.
# Update Fluent Bit with new configuration changes.
#
$ helm upgrade --kubeconfig target/2nodeconfig.yaml --install fluent-bit fluent/fluent-bit --set image.tag=3.2.9 --namespace=logging --create-namespace --values=support/fluentbit-helm.yaml
Release "fluent-bit" has been upgraded. Happy Helming!
NAME: fluent-bit
LAST DEPLOYED: Mon Mar 31 09:02:01 2025
NAMESPACE: logging
STATUS: deployed
REVISION: 3
NOTES:
Get Fluent Bit build information by running these commands:
export POD_NAME=$(kubectl get pods --namespace logging -l "app.kubernetes.io/name=fluent-bit,app.kubernetes.io/instance=fluent-bit" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace logging port-forward $POD_NAME 2020:2020
curl http://127.0.0.1:2020
# determine the Fluent Bit pod name.
#
$ kubectl --kubeconfig target/2nodeconfig.yaml get pods --namespace logging
NAME READY STATUS RESTARTS AGE
fluent-bit-zs5qk 1/1 Running 0 39s
To verify the logs are now being collected and parsed into the JSON format, you run the following command.
# display Fluent Bit collected logs in JSON.
#
$ kubectl --kubeconfig target/2nodeconfig.yaml logs fluent-bit-zs5qk --namespace logging
…
[CUT_LONG_LOG_OUTPUT_HERE]
{"date":1743406540.172326,"time":"2025-03-31T07:35:40.17232597Z","stream":"stdout","_p":"F","log":"2025-03-31T07:35:40.172158Z 0 [System] [MY-010931] [Server] /opt/bitnami/mysql/bin/mysqld: ready for connections. Version: '8.4.0' socket: '/opt/bitnami/mysql/tmp/mysql.sock' port: 3306 Source distribution."}
You can see that the last log line from the earlier output is now parsed into JSON, with key-value pairs instead of just lines of log text.
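If you happen to have jq installed, an optional way to make a single JSON line easier to scan is to pipe it through jq; the pod name below is the one from this run, so substitute your own.
# pretty-print the last collected log line (optional, requires jq).
#
$ kubectl --kubeconfig target/2nodeconfig.yaml logs fluent-bit-zs5qk --namespace logging | tail -n 1 | jq .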
You can also adjust the timestamp used for the dates in your logs, turning it into a more readable format. Again, this is done by adjusting the output plugin parameters in the file support/fluentbit-helm.yaml as shown below, and then updating our instance.
…
outputs:
  - name: stdout
    match: '*'
    format: json_lines
    json_date_format: java_sql_timestamp
…
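As a side note, java_sql_timestamp is just one of the date renderings the stdout output plugin accepts through json_date_format; Fluent Bit also supports values such as double (the default), iso8601, and epoch, so a sketch of an ISO 8601 variant would look like this.
…
outputs:
  - name: stdout
    match: '*'
    format: json_lines
    # other supported values include double (the default), iso8601, and epoch
    json_date_format: iso8601
…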
You can then update your Fluent Bit using the same Helm upgrade command as before.
# Update Fluent Bit with new configuration changes.
#
$ helm upgrade --kubeconfig target/2nodeconfig.yaml --install fluent-bit fluent/fluent-bit --set image.tag=3.2.9 --namespace=logging --create-namespace --values=support/fluentbit-helm.yaml
Release "fluent-bit" has been upgraded. Happy Helming!
NAME: fluent-bit
LAST DEPLOYED: Mon Mar 31 09:02:01 2025
NAMESPACE: logging
STATUS: deployed
REVISION: 3
NOTES:
Get Fluent Bit build information by running these commands:
export POD_NAME=$(kubectl get pods --namespace logging -l "app.kubernetes.io/name=fluent-bit,app.kubernetes.io/instance=fluent-bit" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace logging port-forward $POD_NAME 2020:2020
curl http://127.0.0.1:2020
# determine the Fluent Bit pod name.
#
$ kubectl --kubeconfig target/2nodeconfig.yaml get pods --namespace logging
NAME READY STATUS RESTARTS AGE
fluent-bit-ff456 1/1 Running 0 39s
To verify the logs are now being collected with the new, more readable date format, you run the following command.
# display Fluent Bit collected logs in JSON.
#
$ kubectl --kubeconfig target/2nodeconfig.yaml logs fluent-bit-ff456 --namespace logging
…
[CUT_LONG_LOG_OUTPUT_HERE]
{"date":"2025-03-31 07:35:40.172325","time":"2025-03-31T07:35:40.17232597Z","stream":"stdout","_p":"F","log":"2025-03-31T07:35:40.172158Z 0 [System] [MY-010931] [Server] /opt/bitnami/mysql/bin/mysqld: ready for connections. Version: '8.4.0' socket: '/opt/bitnami/mysql/tmp/mysql.sock' port: 3306 Source distribution."}
The date value is now in a more readable format.
Next, note that there are two kinds of log lines in your output. The key ‘stream’ shows that some of the logs are just ‘stdout’, or informational, messages, while others are ‘stderr’ messages. The error messages are hard to spot among the rest, so let’s filter out the non-critical messages by adding a grep filter section to our Fluent Bit pipeline configuration in the file support/fluentbit-helm.yaml as follows.
…
filters:
  - name: grep
    match: '*'
    regex: stream stderr
…
This filter keeps only the lines whose ‘stream’ key has the value ‘stderr’, filtering out all non-error log lines. Update your Fluent Bit using the same Helm upgrade command as before.
# Update Fluent Bit with new configuration changes.
#
$ helm upgrade --kubeconfig target/2nodeconfig.yaml --install fluent-bit fluent/fluent-bit --set image.tag=3.2.9 --namespace=logging --create-namespace --values=support/fluentbit-helm.yaml
Release "fluent-bit" has been upgraded. Happy Helming!
NAME: fluent-bit
LAST DEPLOYED: Mon Mar 31 09:45:24 2025
NAMESPACE: logging
STATUS: deployed
REVISION: 3
NOTES:
Get Fluent Bit build information by running these commands:
export POD_NAME=$(kubectl get pods --namespace logging -l "app.kubernetes.io/name=fluent-bit,app.kubernetes.io/instance=fluent-bit" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace logging port-forward $POD_NAME 2020:2020
curl http://127.0.0.1:2020
# determine the Fluent Bit pod name.
#
$ kubectl --kubeconfig target/2nodeconfig.yaml get pods --namespace logging
NAME READY STATUS RESTARTS AGE
fluent-bit-2n7qd 1/1 Running 0 19s
To verify that only the error log lines are now being collected, you run the following command.
# display Fluent Bit collected logs in JSON.
#
$ kubectl --kubeconfig target/2nodeconfig.yaml logs fluent-bit-2n7qd --namespace logging
…
[CUT_LONG_LOG_OUTPUT_HERE]
{"date":"2025-03-31 07:35:39.807821","time":"2025-03-31T07:35:39.807821642Z","stream":"stderr","_p":"F","log":"\u001b[38;5;6mmysql \u001b[38;5;5m07:35:39.80 \u001b[0m\u001b[38;5;2mINFO \u001b[0m ==> ** MySQL setup finished! **"}
{"date":"2025-03-31 07:35:39.820350","time":"2025-03-31T07:35:39.82035019Z","stream":"stderr","_p":"F","log":"\u001b[38;5;6mmysql \u001b[38;5;5m07:35:39.81 \u001b[0m\u001b[38;5;2mINFO \u001b[0m ==> ** Starting MySQL **"}
Now all the Ghost CMS logs are being collected by Fluent Bit at the edge where it’s running, but Fluent Bit filters out non-essential log entries and only forwards the error log messages.
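It is worth knowing that the grep filter can also work the other way around: instead of keeping only matching records, it can drop them. A minimal sketch that would exclude the stderr lines rather than keep them (the opposite of what we want here, but handy for noisy keys) looks like this.
…
filters:
  - name: grep
    match: '*'
    # exclude drops matching records instead of keeping them
    exclude: stream stderr
…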
As a final exercise, you can modify any part of your logs to add, remove, or update information. Let’s showcase this by adding another filter configuration that replaces our stream value with a very clear error indicator, as follows in the file support/fluentbit-helm.yaml.
…
filters:
  - name: grep
    match: '*'
    regex: stream stderr
  - name: modify
    match: '*'
    condition: Key_Value_Equals stream stderr
    remove: stream
    add:
      - STATUS REALLY_BAD
      - ACTION CALL_SRE
…
This filter targets all lines whose ‘stream’ key has the value ‘stderr’. It then removes the ‘stream’ key-value pair and adds two new key-value pairs indicating the action needed. Update your Fluent Bit using the same Helm upgrade command as before and inspect the new Fluent Bit instance’s logs to verify your changes are working in the log output.
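Since the exact commands are not repeated here, the sequence below sketches them out, reusing the POD_NAME export from the Helm notes shown earlier (with the kubeconfig flag added); your pod name will differ.
# Update Fluent Bit with new configuration changes.
#
$ helm upgrade --kubeconfig target/2nodeconfig.yaml --install fluent-bit fluent/fluent-bit --set image.tag=3.2.9 --namespace=logging --create-namespace --values=support/fluentbit-helm.yaml

# determine the new Fluent Bit pod name and display its collected logs.
#
$ export POD_NAME=$(kubectl --kubeconfig target/2nodeconfig.yaml get pods --namespace logging -l "app.kubernetes.io/name=fluent-bit,app.kubernetes.io/instance=fluent-bit" -o jsonpath="{.items[0].metadata.name}")
$ kubectl --kubeconfig target/2nodeconfig.yaml logs $POD_NAME --namespace logging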
…
[CUT_LONG_LOG_OUTPUT_HERE]
{"date":"2025-03-31 07:35:39.820350","time":"2025-03-31T07:35:39.82035019Z","_p":"F","log":"\u001b[38;5;6mmysql \u001b[38;5;5m07:35:39.81 \u001b[0m\u001b[38;5;2mINFO \u001b[0m ==> ** Starting MySQL **","STATUS":"REALLY_BAD","ACTION":"CALL_SRE"}
As you can see, the new STATUS and ACTION keys have been added to all lines with an error.
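As an optional sanity check, you can count how many of the collected lines carry the new key; this reuses the POD_NAME export from above.
# count the log lines carrying the new STATUS key.
#
$ kubectl --kubeconfig target/2nodeconfig.yaml logs $POD_NAME --namespace logging | grep -c '"STATUS":"REALLY_BAD"'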
What’s next
Taking control of your logs with Fluent Bit has been easy to do so far. The demo environment and the actions taken are easy to follow because they focus on a very limited environment. When you start scaling out your activities to span multiple clusters and multiple Fluent Bit instances, you are going to need help managing your log strategy. Up next, we continue our journey by forwarding our log streams to Chronosphere Telemetry Pipeline and seeing how we can manage our log strategy at scale.
Buyer’s Guide: Telemetry Pipelines
Build a smarter telemetry pipeline. Download The Buyer’s Guide to Telemetry Pipelines