As we have written previously, having access to Kubernetes metadata can enhance traceability and significantly reduce mean time to remediate (MTTR). However, the metadata you need may not be included in the logs. The Fluent Bit Kubernetes filter plugin makes it easy to enrich your logs with the metadata you need to troubleshoot issues.

When run in Kubernetes (K8s) as a daemonset, Fluent Bit can ingest Kubelet logs and enrich them with additional metadata from the Kubernetes API server. This includes any annotations or labels on the pod and information about the namespace, pod, and the container the log is from. It is very simple to do, and, in fact, it is also the default setup when deploying Fluent Bit via the helm chart.

The documentation goes into the full details of what metadata are available and how to configure the Fluent Bit Kubernetes filter plugin to gather them. In this post, we’ll give an overview of how the filter works and provide common troubleshooting tips, particularly with issues caused by misconfiguration.

How to get Kubernetes metadata?

Let us take a step back and look at what information is required to query the K8s API server for metadata about a particular pod. We need two things:

  1. Namespace
  2. Pod name

Cunningly, the Kubelet logs on the node have to provide this information in their filename by design. This information enables Fluent Bit to query the K8s API server when all it has is the log file. Therefore, given a pod log file(name), we should be able to query the K8s API server for the rest of the metadata describing the pod.

Using Fluent Bit to enrich the logs

First off, we need the actual logs from the Kubelet. This is typically done by using a daemonset to ensure a Fluent Bit pod runs on every node and then mounts the Kubelet logs from the node into the pod.

Now that we have the log files themselves, we should be able to extract enough information to query the K8s API server. We do this with a default setup using the tail plugin to read the log files and inject the filename into the tag:

[INPUT]
    Name tail
    Tag kube.*
    Path /var/log/containers/*.log
    multiline.parser  docker, cri

Wildcards in the tag are handled in a special way by the tail input plugin. This configuration injects the full path and filename of the log file into the tag after the kube. prefix.

Once the kubernetes filter receives these records, it parses the tag to extract the information required. To do so, it needs the kube_tag_prefix value to strip off any redundant tag or path, leaving just the log filename, which contains the two things required to query the K8s API server (namespace and pod name). Using the defaults would look like this:

[FILTER]
    Name             kubernetes
    Match            kube.*
    Kube_Tag_Prefix  kube.var.log.containers.

Fluent Bit inserts the extra metadata from the K8s API server under the top-level kubernetes key.
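To make the shape of the result concrete, here is a minimal Python sketch of a record before and after enrichment. The helper enrich and all of the values are hypothetical and only illustrate the nesting under the kubernetes key; the real filter adds more fields than shown here.

```python
def enrich(record, metadata):
    """Return a copy of the record with the metadata nested under 'kubernetes'."""
    enriched = dict(record)
    enriched["kubernetes"] = metadata
    return enriched


# Hypothetical log record as read from the container log file.
record = {"log": "GET /index.html 200", "stream": "stdout"}

# Hypothetical metadata, standing in for what the API server would return.
metadata = {
    "pod_name": "apache-logs-annotated",
    "namespace_name": "default",
    "container_name": "apache",
    "labels": {"app": "apache"},
    "annotations": {"team": "web"},
}

enriched = enrich(record, metadata)
print(enriched["kubernetes"]["pod_name"])  # apache-logs-annotated
```

The original fields stay at the top level, so existing queries on log or stream keep working.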

Using an example, we can see how this flows through the system.

Assume this is our log file:

/var/log/containers/apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log

The resulting tag would be (slashes are replaced with dots):

kube.var.log.containers.apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log

We then strip off the kube_tag_prefix:

apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log

Now we can extract the relevant fields with a regex:

(?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<container_id>[a-z0-9]{64})\.log$
A Fluent Bit diagram illustrating the process flow, including Kubelet Log, Tag Generation, Prefix Stripping, Namespace & Pod Extraction, querying the Kubernetes API Server with a Kubernetes filter for adding metadata enrichment to logs.
The Fluent Bit Kubernetes filter extracts information from the log filename in order to query the K8s API server to retrieve metadata that is then added to the log file.
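The prefix stripping and field extraction above can be sketched in a few lines of Python. This is an illustration only, not the filter's implementation: the regex is the one shown above, rewritten in Python's (?P<name>...) syntax, and the function name fields_from_tag is an assumption made for this example.

```python
import re

# Default prefix from the filter configuration above.
KUBE_TAG_PREFIX = "kube.var.log.containers."

# The regex from the post, with named groups in Python syntax.
FILENAME_RE = re.compile(
    r"(?P<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9](?:[-a-z0-9]*[a-z0-9])?)*)"
    r"_(?P<namespace_name>[^_]+)"
    r"_(?P<container_name>.+)"
    r"-(?P<container_id>[a-z0-9]{64})\.log$"
)


def fields_from_tag(tag, prefix=KUBE_TAG_PREFIX):
    """Strip the tag prefix and extract the fields; return None if nothing matches."""
    if not tag.startswith(prefix):
        return None
    match = FILENAME_RE.search(tag[len(prefix):])
    return match.groupdict() if match else None


container_id = "aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6"
tag = "kube.var.log.containers.apache-logs-annotated_default_apache-" + container_id + ".log"

fields = fields_from_tag(tag)
print(fields["namespace_name"], fields["pod_name"])  # default apache-logs-annotated
```

Only namespace_name and pod_name are needed for the API query; the container name and ID come along for free from the same filename.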


Troubleshooting misconfiguration woes

While the Fluent Bit Kubernetes filter takes care of the hard part of extracting the K8s metadata from the API server and adding them to the logs, when users experience difficulty, it is usually the result of misconfiguration. There are a few common errors that we frequently see in community channels:

  1. Mismatched configuration of tags and prefix
  2. Invalid RBAC/unauthorised
  3. Dangling symlinks for pod logs
  4. Caching affecting dynamic labels
  5. Incorrect parsers

Let’s discuss how to identify these issues and correct them.

Mismatched tag and tag prefix

The most common problems occur when the default tag is changed for the tail input plugin or when a different path is used for the logs. When this happens, the kube_tag_prefix must also be changed to ensure it strips everything off except the filename.

The kubernetes filter will otherwise end up with a garbage filename that it either complains about immediately, or it injects invalid data into the request to the K8s API server. In either case, the filter will not enrich the log record as it has no additional data to add.

Typically, you will see a warning message in the log if the tag is obviously wrong, or with log_level debug, you can see the requests to the K8s API server with invalid pod name or namespace plus the response indicating there is no such pod.

$ kubectl logs fluent-bit-cs6sg
…
[2023/11/30 10:08:14] [debug] [filter:kubernetes:kubernetes.0] Send out request to API Server for pods information
[2023/11/30 10:08:14] [debug] [http_client] not using http_proxy for header
[2023/11/30 10:08:14] [debug] [http_client] server kubernetes.default.svc:443 will close connection #60
[2023/11/30 10:08:14] [debug] [filter:kubernetes:kubernetes.0] Request (ns=default, pod=s.fluent-bit-cs6sg) http_do=0, HTTP Status: 404
[2023/11/30 10:08:14] [debug] [filter:kubernetes:kubernetes.0] HTTP response
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \"s.fluent-bit-cs6sg\" not found","reason":"NotFound","details":{"name":"s.fluent-bit-cs6sg","kind":"pods"},"code":404}

This example was created using a configuration file like the one below for the official Helm chart. We added two characters (my) to the default tag, and the error details above show the consequence: the pod name has two extra characters at the front. It should be fluent-bit-cs6sg but is s.fluent-bit-cs6sg. No such pod exists, so the API server reports a failure. Without log_level debug, you simply get no metadata.

config:
  service: |
    [SERVICE]
        Daemon Off
        Log_Level debug

  inputs: |
    [INPUT]
        Name tail
        Path /var/log/containers/*.log
        multiline.parser docker, cri
        Tag mykube.*
        Mem_Buf_Limit 5MB
        Skip_Long_Lines On

  filters: |
    [FILTER]
        Name kubernetes
        Match *
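The stray s. in the pod name can be reproduced with a short Python sketch. It assumes the filter removes as many characters as the prefix is long rather than checking that the prefix text matches; that assumption is consistent with the 404 response above but is a simplification, and the function name is hypothetical.

```python
DEFAULT_PREFIX = "kube.var.log.containers."


def strip_by_length(tag, kube_tag_prefix):
    """Drop len(prefix) characters from the front of the tag without comparing them."""
    return tag[len(kube_tag_prefix):]


# Tag produced by "Tag mykube.*"; the container ID is abbreviated to <id> here.
tag = "mykube.var.log.containers.fluent-bit-cs6sg_default_fluent-bit-<id>.log"

# Default prefix: two characters too short, so "s." is left in front of the pod name.
print(strip_by_length(tag, DEFAULT_PREFIX))

# Prefix that matches the modified tag: only the filename remains.
print(strip_by_length(tag, "mykube.var.log.containers."))
```

In this case the fix would be to set Kube_Tag_Prefix to mykube.var.log.containers. in the filter so that it lines up with the modified tag.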

Unexpected tags

Using wildcards in the tail input plugin can trip you up sometimes: the * wildcard is replaced by the full path of the file but with any special characters (e.g. /) replaced with dots (.).

Beware of modifying the default kube.* tag in this case, and — as I try to stress as much as possible — use stdout to see the actual tags you are getting if you have any issues. As an example, consider the following tail configuration:

[INPUT]
    Name tail
    Path /var/log/containers/*.log

Now, you will get tags that look like this depending on what you configure:

  1. Tag kube.* ⇒ kube.var.log.containers.<filename>
  2. Tag kube_* ⇒ kube_.var.log.containers.<filename>

In the second case, notice the underscore followed by a dot. In the first case there is no double dot because the input plugin collapses it automatically. Differences like this can mean your filters do not match later on and can cause confusing problems. The first step is always to verify with the trusty stdout output.
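The two results can be reproduced with a toy model of the expansion in Python. This models only the behavior described above (slashes become dots, a doubled dot is collapsed) and is not Fluent Bit's actual code; the function name and the file name are hypothetical.

```python
def expand_tag(tag_template, path):
    """Replace '*' with the path, using '.' for '/', then collapse a doubled dot."""
    expanded = tag_template.replace("*", path.replace("/", "."), 1)
    return expanded.replace("..", ".")


path = "/var/log/containers/app.log"  # hypothetical log file

print(expand_tag("kube.*", path))  # kube.var.log.containers.app.log
print(expand_tag("kube_*", path))  # kube_.var.log.containers.app.log
```

A filter with Match kube.* would not see records from the second tag, and a Kube_Tag_Prefix written for the first tag would be the wrong length for it.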

Invalid RBAC

The Fluent Bit pod must have the relevant roles added to its service account that allow it to query the K8s API for the information it needs. Unfortunately, this error is typically just reported as a connectivity warning to the K8s API server, so it can be easily missed.

To troubleshoot this issue, use log_level debug to see the response from the K8s API server. The message will basically say “missing permissions to do X” or something similar and then it is obvious what is wrong.

[2022/12/08 15:53:38] [ info] [filter:kubernetes:kubernetes.0] testing connectivity with API server...
[2022/12/08 15:53:38] [debug] [filter:kubernetes:kubernetes.0] Send out request to API Server for pods information
[2022/12/08 15:53:38] [debug] [http_client] not using http_proxy for header
[2022/12/08 15:53:38] [debug] [http_client] server kubernetes.default.svc:443 will close connection #23
[2022/12/08 15:53:38] [debug] [filter:kubernetes:kubernetes.0] Request (ns=default, pod=calyptia-cluster-logging-316c-dcr7d) http_do=0, HTTP Status: 403
[2022/12/08 15:53:38] [debug] [filter:kubernetes:kubernetes.0] HTTP response
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \"calyptia-cluster-logging-316c-dcr7d\" is forbidden: User \"system:serviceaccount:default:default\" cannot get resource \"pods\" in API group \"\" in the namespace \"default\"","reason":"Forbidden","details":{"name":"calyptia-cluster-logging-316c-dcr7d","kind":"pods"},"code":403}

[2022/12/08 15:53:38] [ warn] [filter:kubernetes:kubernetes.0] could not get meta for POD calyptia-cluster-logging-316c-dcr7d

As the example above shows, without log_level debug all you get is the warning message:

[2022/12/08 15:53:38] [ info] [filter:kubernetes:kubernetes.0] testing connectivity with API server...
[2022/12/08 15:53:38] [ warn] [filter:kubernetes:kubernetes.0] could not get meta for POD calyptia-cluster-logging-316c-dcr7d
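Because the 404 and 403 cases look the same at the default log level, it can help to classify the debug response explicitly. The Python sketch below is a heuristic written for this post, not part of Fluent Bit: it reads the JSON Status body printed in the debug log and maps the code to the likely cause discussed in these sections.

```python
import json


def diagnose(response_body):
    """Map a Kubernetes API 'Status' response body to a likely cause (heuristic)."""
    status = json.loads(response_body)
    code = status.get("code")
    if code == 404:
        return "pod not found: check the tag and Kube_Tag_Prefix"
    if code in (401, 403):
        return "not authorized: check the service account's RBAC roles"
    return "unexpected response: " + str(status.get("message"))


# Shortened version of the 403 response shown above.
body = '{"kind":"Status","status":"Failure","reason":"Forbidden","code":403}'
print(diagnose(body))  # not authorized: check the service account's RBAC roles
```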

Dangling symlinks

Kubernetes has evolved over the years, and new container runtimes have also come along. As a result, the filename requirements for Kubelet logs may be handled using a symlink from a correctly named pod log file to the actual container log file created by the container runtime. When mounting the pod logs into your container, ensure they are not dangling links and that their destination is also correctly mounted.
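A quick way to check for this from inside the Fluent Bit container (or any debug container with the same mounts) is to list symlinks whose targets cannot be resolved. The following Python sketch uses only the standard library; the function name is hypothetical, and /var/log/containers is the path used throughout this post.

```python
import os


def find_dangling_symlinks(directory):
    """Return the names of symlinks in `directory` whose targets do not exist."""
    dangling = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # os.path.exists() follows symlinks, so it is False for a broken link.
        if os.path.islink(path) and not os.path.exists(path):
            dangling.append(name)
    return dangling


if os.path.isdir("/var/log/containers"):
    print(find_dangling_symlinks("/var/log/containers"))
```

Any name in the output points at a destination that is not mounted into the container, which usually means an additional host path needs to be mounted.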

Caching

Fluent Bit caches the response from the K8s API server to prevent rate limiting or overloading the server. As a result, if annotations or labels are applied or removed dynamically, then those changes will not be seen until the next time the cache is refreshed. A simple test is just to roll/delete the pod so a fresh one is deployed and check if it picks up the changes.
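The effect of caching on dynamic labels can be illustrated with a toy cache in Python. This is not Fluent Bit's cache: the class name, the time-to-live semantics, and the 60-second value are all assumptions chosen to show why a label change is invisible until an entry is refreshed.

```python
import time


class MetadataCache:
    """Toy TTL cache: entries are reused until they are older than ttl_seconds."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch        # function that queries the API server
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}         # (namespace, pod) -> (fetched_at, metadata)

    def get(self, namespace, pod):
        key = (namespace, pod)
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]       # possibly stale labels or annotations
        metadata = self._fetch(namespace, pod)
        self._entries[key] = (now, metadata)
        return metadata


# Simulated API server state and a fake clock so the example runs instantly.
labels = {"version": "v1"}
now = [0.0]
cache = MetadataCache(lambda ns, pod: dict(labels), ttl_seconds=60, clock=lambda: now[0])

first = cache.get("default", "web")
labels["version"] = "v2"               # label changed on the live pod
stale = cache.get("default", "web")    # still served from the cache
now[0] = 61.0
fresh = cache.get("default", "web")    # entry expired, so it is fetched again
print(first, stale, fresh)
```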

Log file parsing

Another common misconfiguration is using custom container runtime parsers in the tail input. This problem is generally a legacy issue as previously, there were no inbuilt CRI or docker multiline parsers. The current recommendation is always to configure the tail input using the provided parsers as per the documentation:

[INPUT]
    name              tail
    path              /var/log/containers/*.log
    multiline.parser  docker, cri

Do not use your own CRI or docker parsers, as they must cope with merging partial lines (identified with a P instead of an F).

The parsers for the tail plugin are not applied sequentially but are mutually exclusive, with the first one matching being applied. The goal is to handle multiline logs created by the Kubelet itself. Later, you can have another filter to handle multiline parsing of the application logs themselves after they have been reconstructed here.
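To show what merging partial lines involves, here is a minimal Python sketch for CRI-formatted lines, which have the form "<timestamp> <stream> <P|F> <message>". It is an illustration of the problem rather than a replacement for the built-in cri parser; the function name is hypothetical, and the sketch ignores details such as size limits and interrupted files.

```python
def merge_cri_lines(lines):
    """Yield (timestamp, stream, message), joining partial (P) lines up to the final (F) line."""
    pending = {}  # stream -> message fragments collected so far
    for line in lines:
        parts = line.rstrip("\n").split(" ", 3)
        timestamp, stream, flag = parts[0], parts[1], parts[2]
        message = parts[3] if len(parts) > 3 else ""
        pending.setdefault(stream, []).append(message)
        if flag == "F":
            # The timestamp of the final fragment is used for the merged record.
            yield timestamp, stream, "".join(pending.pop(stream))


lines = [
    "2023-11-30T10:08:14.1Z stdout P Hello, ",
    "2023-11-30T10:08:14.2Z stdout F world",
    "2023-11-30T10:08:15.0Z stderr F oops",
]
merged = list(merge_cri_lines(lines))
print(merged)
```

A custom parser that treats every line as complete would emit "Hello, " and "world" as two separate records, which is exactly the failure mode the built-in parsers avoid.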

What’s Next?

To learn more about Fluent Bit, we recommend joining the Fluent Community Slack channel where you will find thousands of other Fluent Bit users. Engage with experts, ask questions, and share best practices. Many of the troubleshooting tips in this blog were originally surfaced in the Slack channel.


We also invite you to download a free copy of Fluent Bit with Kubernetes by Phil Wilkins. This practical guide to monitoring cloud native and traditional environments with Fluent Bit covers the basics of collecting app logs, filtering, routing, enriching, and transforming logs, metrics, and traces.


Fluent Bit & K8s

Fluent Bit is a widely-used open-source data collection agent, processor, and forwarder that enables you to collect logs, metrics, and traces from various sources, filter and transform them, and then forward them to multiple destinations.

In fact, if you are using Kubernetes on a public cloud provider, odds are that you are already running Fluent Bit. Fluent Bit is deployed by default in major Kubernetes distributions, including Google Kubernetes Engine (GKE), Amazon Elastic Kubernetes Service (EKS), and Azure Kubernetes Service (AKS), and is used to route data to the cloud providers’ backends (e.g., CloudWatch, Azure Monitor Logs, or Google Cloud Logging).

In this post, we will demonstrate how to collect logs from different Kubernetes components (such as kube-proxy, kubelet, etc.) and send them to the destination of your choice. We’ll send them to Elasticsearch, but Fluent Bit also supports more than 40 other destinations.

Note: This guide exclusively focuses on the logs of Kubernetes components. If you wish to collect logs from application containers running on Kubernetes, please refer to this guide.


Prerequisites

Identifying available Kubernetes components

Let’s begin by identifying the names of the Kubernetes components from which we need to collect the logs.

According to the official documentation of Kubernetes, Kubernetes clusters are generally composed of the following:

Control plane components

Worker node components

The above lists contain the typical components you find in a Kubernetes cluster, but they are not necessarily an accurate starting point. Various flavors of Kubernetes exist, such as self-hosted clusters, managed services, OpenShift, Cluster API, etc. As a result, the specific component list might differ depending on the cluster you are working with.

For example, since we are using a managed Kubernetes cluster (EKS) from AWS, we don’t have control over the control plane components. These are entirely managed by AWS, and the logs of these control plane nodes are available in CloudWatch directly.

However, we do have control over the worker nodes. So our component list is as shown below:

Suppose you were using a self-hosted Kubernetes cluster on-premises. In that case, your list would include all the components we mentioned earlier.

Moving forward, the new list we’ve outlined has another complexity: Kubernetes offers two options for running these components. They can be executed either as a daemon process on the host or as a Kubernetes pod on a node (see Kubernetes docs).

For our EKS cluster, the kubelet and container-runtime run as daemon processes on the host machine, while the kube-proxy and cni-plugin run as Kubernetes pods.

Below is our final list for EKS components with some additional information attached to it.

# Below components run as Daemon Processes
1. Kubelet
   Service Name: "kubelet.service"
2. Container Runtime (containerd)
   Service Name: "containerd.service"

# Below components run as Containers
1. Kube Proxy
   Namespace: "kube-system"
   Resource: "daemonset"
   Resource Name: "kube-proxy"
2. CNI Plugin (VPC CNI)
   Namespace: "kube-system"
   Resource: "daemonset"
   Resource Name: "aws-node"

To summarize, here’s a three-step process for selecting the components from which to gather logs:

  1. Identify all the components in your cluster.
  2. Disregard any component over which you don’t have control.
  3. Identify where each remaining component runs (as a daemon process on the host or as a pod), and collect additional information about it.

With the components list ready, it’s time to configure Fluent Bit.

Selecting the input plugin

From our components list, we can see that we have two different types of data sources:

  1. containers
  2. daemon processes.

Fluent Bit offers an input plugin for each of these data sources.

Tail plugin: for reading log files

Containers store their logs in plain text files, which can be read by standard programs (like cat, tail, etc.). The Tail Plugin operates similarly to the Linux tail command, where you specify the file path as an argument to read a specific file. In this context, the plugin takes the Path as a parameter to read files on the host machine. Since we’re using containerd as our container runtime, pod logs are stored in a nested directory structure at /var/log/pods.

 /var/log/pods/                             # Root directory for pod logs
|
|-- <namespace>_<pod-name>_<pod-uuid>/     # Directory for a specific pod
|   |
|   |-- <container-name>/                  # Directory for a specific container within the pod
|       |
|       |-- 0.log                          # Log file for the container's first attempt (can increment for restarts)

However, we will use the /var/log/containers directory, which contains symbolic links to all files in the /var/log/pods directory. This directory is preferred as it stores files in a flat structure, with no nested directories.

Screenshot of a terminal displaying log entries with various timestamps, file paths, and system activities related to Kubernetes component logs like controllers and Pods collected via Fluent Bit.

To select only the aws-node and kube-proxy log files from the many others in the /var/log/containers directory, we’ll use glob pattern matching. File names in this directory have the form <pod-name>_<namespace-name>_<container-name>-<container-id>.log, so a pattern of the form *<namespace-name>_<container-name>* selects specific files. Our final paths for the log files will look like /var/log/containers/*kube-system_kube-proxy* and /var/log/containers/*kube-system_aws-node*:

[INPUT]
   Name  tail
   Tag   kubernetes.core.containers*
   Path  /var/log/containers/*kube-system_kube-proxy*,/var/log/containers/*kube-system_aws-node*
   multiline.parser docker, cri    
   Read_from_Head true

For more information on the Tail Plugin of Fluent Bit, follow the official documentation.

Systemd plugin: for reading logs of daemon processes

On Linux machines, daemon processes are controlled using the systemctl CLI. Their logs are stored by journald in a binary format in the /var/log/journal directory. Since the Tail Plugin cannot read these binary files directly, we use the systemd plugin, which handles the format conversion and emits the logs as structured, human-readable records. This plugin provides the Systemd_Filter parameter to specify the service name from which to read logs.

Our Fluent Bit configuration for the systemd plugin aligns with our component list as shown below:

[INPUT]
    Name            systemd
    Tag             kubernetes.*
    Systemd_Filter  _SYSTEMD_UNIT=kubelet.service
    Systemd_Filter  _SYSTEMD_UNIT=containerd.service

Note: If you specify a service that does not exist, Fluent Bit will silently ignore it.

For more information on the Systemd Plugin of Fluent Bit, follow the official documentation.

Final Fluent Bit input configuration

Combining both plugins, our final Fluent Bit input configuration will look like this:

[INPUT]
    Name  tail
    Tag   kubernetes.core.containers*
    Path  /var/log/containers/*kube-system_kube-proxy*,/var/log/containers/*kube-system_aws-node*
    multiline.parser docker, cri    
    Read_from_Head true

[INPUT]
    Name            systemd
    Tag             kubernetes.*
    Systemd_Filter  _SYSTEMD_UNIT=kubelet.service
    Systemd_Filter  _SYSTEMD_UNIT=containerd.service

Note: The above configuration is derived from our components list, which might be different for a different Kubernetes cluster. This results in different configuration values, but they will be along the same lines.

With our configuration ready, let’s move forward and begin implementing it in Kubernetes.

Applying configuration in Kubernetes

A terminal window displaying lines of Kubernetes component logs with various debug and info messages, timestamps, and labels, all efficiently collected by Fluent Bit.

We will deploy Fluent Bit using the Helm chart available at Fluent Bit Helm Chart.

Instructions:

1) Add Fluent Bit Helm Repo
Use the command below to add the Fluent Bit Helm repository:

helm repo add fluent https://fluent.github.io/helm-charts

2) Configure Fluent Bit
The default Helm chart configuration of Fluent Bit reads container logs and sends them to an Elasticsearch cluster. Before sending logs to Elasticsearch, we would like to test the configuration, so we have added a stdout output plugin to view logs in stdout itself for verification.

3) Override the default configuration
Create a file called values.yaml with the following contents:

config:
  inputs: |
    [INPUT]
        Name            systemd
        Tag             kubernetes.*
        Systemd_Filter  _SYSTEMD_UNIT=kubelet.service
        Systemd_Filter  _SYSTEMD_UNIT=containerd.service

    [INPUT]
        Name  tail
        Tag   kubernetes.containers*
        Path  /var/log/containers/*kube-system_kube-proxy*,/var/log/containers/*kube-system_aws-node*
        multiline.parser docker, cri    
        Read_from_Head true

  outputs: |
    [OUTPUT]
        Name   stdout
        Match  *

4) Deploy Fluent Bit

Use the command below:

helm upgrade -i fluent-bit fluent/fluent-bit --values values.yaml

5) Wait for Fluent Bit pods to run

Ensure that the Fluent Bit pods reach the Running state.

kubectl get pods

6) Verify Fluent Bit is working

Use the command below to verify that Fluent Bit is reading the logs of the Kubernetes components that we configured:

kubectl logs <fluent-bit-pod-name> -f

Search the logs for the components mentioned in our list in the output such as containerd and kube-proxy.

Debugging tips

If you are unable to view the logs of any of the expected Kubernetes components, check the following:

  1. Ensure these two host volumes are attached to the Fluent Bit pod: /var/log/ and /etc/machine-id.
  2. SSH into your machine and verify whether the daemon processes are running with the same names mentioned in the configuration. Validate using the following command: systemctl | grep "kubelet.service\|containerd.service" (or alternatively systemctl status <service-name>).

Sending logs to Elasticsearch

With a working configuration, we can now send this data to any of the available Fluent Bit output plugins. Since we decided to send data to Elasticsearch, modify the output configuration in values.yaml by adding the es output plugin, and then apply the configuration by rerunning the helm upgrade command from step 4.

For more information on ES plugin check the official documentation, or see our previous post for a step-by-step tutorial for sending logs to Elasticsearch.

config:
  inputs: |
    [INPUT]
        Name            systemd
        Tag             kubernetes.*
        Systemd_Filter  _SYSTEMD_UNIT=kubelet.service
        Systemd_Filter  _SYSTEMD_UNIT=containerd.service

    [INPUT]
        Name  tail
        Tag   kubernetes.containers*
        Path  /var/log/containers/*kube-system_kube-proxy*,/var/log/containers/*kube-system_aws-node*
        multiline.parser docker, cri    
        Read_from_Head true

  outputs: |
    [OUTPUT]
        Name es
        Match *
        Host <your-host-name-of-elastic-search>
        Logstash_Format On
        Retry_Limit False

And you should be able to see logs in Elasticsearch.

A dashboard displays a bar graph of timestamped hits and tabular logs with details such as time, log message, and event tags, all seamlessly integrated with Fluent Bit for optimal Kubernetes log management.
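If the logs do not show up where you expect, check which index they were written to. With Logstash_Format On, the es output writes to a date-based index rather than a fixed one. The Python sketch below shows the naming, assuming the plugin defaults of a logstash prefix and a %Y.%m.%d date format; the function name is hypothetical.

```python
from datetime import datetime, timezone


def logstash_index(timestamp, prefix="logstash", date_format="%Y.%m.%d"):
    """Build the daily index name used when Logstash_Format is On (defaults assumed)."""
    return prefix + "-" + timestamp.strftime(date_format)


print(logstash_index(datetime(2023, 11, 30, tzinfo=timezone.utc)))  # logstash-2023.11.30
```

An index pattern such as logstash-* in your Elasticsearch UI would then cover all of the daily indices.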

Conclusion

This guide provided an overview of configuring and deploying Fluent Bit in a Kubernetes environment to manage logs from containers and daemon processes. By utilizing the Helm chart along with specific input and output plugins, administrators can streamline log management and forwarding, ensuring vital data is accessible and analyzable.

Get a free copy of Fluent Bit with Kubernetes


If you would like to learn more about using Fluent Bit with containerized cloud native environments, we invite you to download a free copy of Fluent Bit with Kubernetes by Phil Wilkins. In this 300+ page book, you’ll learn how to establish and optimize observability systems for Kubernetes, and more. From fundamental configuration to advanced integrations, this book lays out Fluent Bit’s full capabilities for log, metric, and trace routing and processing.


Originally created at Treasure Data more than a decade ago, Fluentd is a widely adopted open source data collection project. Both Fluentd and Fluent Bit, a related project specifically created for containerized environments, utilize Forward Protocol for transporting log messages. In this post, we’ll explain Forward Protocol and its uses and provide an example of using Fluentd and Fluent Bit together to collect and route logs to a MongoDB database for storage.

What is Forward Protocol?

Forward Protocol is a network protocol used by Fluentd to transport log messages between nodes. It is a binary protocol that is designed to be efficient and reliable. It uses TCP to transport messages and UDP for the heartbeat to check the status of servers.

It is a lightweight and efficient protocol that allows for the transmission of logs across different nodes or systems in real time. The protocol also supports buffering and retransmission of messages in case of network failures, ensuring that log data is not lost.

The Fluentd forward client and server are two components of the Fluentd logging system that work together to send and receive log data between different sources and destinations. They communicate using MessagePack arrays and offer authentication and authorization options to ensure only authorized entities can send and receive logs. Read more about the forward protocol.

A diagram showing the message flow between a Fluent forward client and a Fluent forward server, detailing steps for authorization, authentication, and data exchange using msgpack arrays and pingpong over the Forward Protocol.
Fluent forward client-server architecture
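To give a feel for what travels over the wire, the Python sketch below builds a single event in the protocol's Message mode, which is a MessagePack array of the form [tag, time, record]. The encoder is hand-rolled and covers only the few types this example needs, purely so the snippet has no dependencies; a real client would use a MessagePack library. The function names and the sample record are assumptions made for illustration.

```python
import struct


def pack(value):
    """Encode a small subset of MessagePack: ints, floats, short strings, lists, dicts."""
    if isinstance(value, bool):
        raise TypeError("booleans are outside the scope of this sketch")
    if isinstance(value, int):
        if 0 <= value < 128:
            return bytes([value])                        # positive fixint
        if 0 <= value < 2**32:
            return b"\xce" + struct.pack(">I", value)    # uint 32
        raise ValueError("integer out of range for this sketch")
    if isinstance(value, float):
        return b"\xcb" + struct.pack(">d", value)        # float 64
    if isinstance(value, str):
        data = value.encode("utf-8")
        if len(data) < 32:
            return bytes([0xA0 | len(data)]) + data      # fixstr
        if len(data) < 256:
            return b"\xd9" + bytes([len(data)]) + data   # str 8
        raise ValueError("string too long for this sketch")
    if isinstance(value, (list, tuple)) and len(value) < 16:
        return bytes([0x90 | len(value)]) + b"".join(pack(v) for v in value)  # fixarray
    if isinstance(value, dict) and len(value) < 16:
        body = b"".join(pack(k) + pack(v) for k, v in value.items())
        return bytes([0x80 | len(value)]) + body         # fixmap
    raise TypeError("unsupported value for this sketch: %r" % (value,))


def forward_message(tag, timestamp, record):
    """Build a Forward Protocol Message-mode event: [tag, time, record]."""
    return pack([tag, int(timestamp), record])


payload = forward_message("fluent_bit", 1700000000, {"cpu_p": 1.5})
print(payload.hex())
```

The first byte (0x93) marks a three-element array, followed by the tag, the timestamp in seconds, and the record map.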

Forward protocol offers the following benefits:

Wider support for Forward Protocol

Apart from Fluentd and Fluent Bit, there’s also a Docker log driver that uses Forward Protocol to send container logs to the Fluentd collector.

The OpenTelemetry Collector also supports receiving logs using Forward Protocol.

Understanding Fluent Bit and Fluentd

Both Fluentd and Fluent Bit are popular logging solutions in the cloud-native ecosystem. They are designed to handle high volumes of logs and provide reliable log collection and forwarding capabilities. Fluent Bit is lightweight and more suitable for edge computing and IoT use cases.

In this section, we’ll take a closer look at the differences between the two tools and understand a use case when you’d want to use both of them together.

|              | Fluentd | Fluent Bit |
|--------------|---------|------------|
| Scope        | Containers / Servers | Embedded Linux / Containers / Servers |
| Language     | C & Ruby | C |
| Memory       | > 60MB | ~1MB |
| Performance  | Medium Performance | High Performance |
| Dependencies | Built as a Ruby Gem, it requires a certain number of gems. | Zero dependencies, unless some special plugin requires them. |
| Plugins      | More than 1000 external plugins are available | More than 100 built-in plugins are available |
| License      | Apache License v2.0 | Apache License v2.0 |

Both Fluent Bit and Fluentd can be used as forwarders or aggregators, either together or as standalone solutions. One way to combine them is to use Fluent Bit to collect logs from containerized applications running in a Kubernetes cluster. Because Fluent Bit has a very small footprint, it can be deployed on every node. Meanwhile, Fluentd can collect logs from various sources outside of Kubernetes, such as servers, databases, and network devices.

Ultimately, the choice between Fluentd and Fluent Bit depends on the specific needs and requirements of the use case at hand.

In the next section, we’ll explore how we can use Forward Protocol to push data from Fluent Bit to Fluentd.

Read more about Fluent Bit and Fluentd use cases

Using Forward Protocol with Fluent Bit and Fluentd

To understand how Forward Protocol works, we’ll set up instances of both Fluent Bit and Fluentd. We’ll collect CPU logs using Fluent Bit, and, using Forward Protocol, we’ll send them to Fluentd. From there, we will push the logs to MongoDB Atlas.

Diagram illustrating Fluent Bit forwarding logs (CPU, syslogs, kernel logs) via Forward Protocol to Fluentd. Fluentd then sends the logs to MongoDB Atlas using the out_mongo plugin.
Our proposed data pipeline

MongoDB Atlas configuration

MongoDB Atlas is a cloud-based database service that allows users to easily deploy, manage, and scale MongoDB databases. It offers features such as automatic backups, monitoring, and security, making it a convenient and reliable option for managing data in the cloud. Hence, we’ll be pushing our logs to MongoDB from Fluentd.

In order to do that, we need to do the following:

Apart from this, you might also have to do the following:

Fluentd configuration

The first step is to configure Fluentd to receive input from a forward source. After you install Fluentd, you need to update the configuration file with the following:

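<source>
  @type forward
  bind 0.0.0.0
  port 24224
</source>

# Fluentd routes each event to the first matching <match> block only,
# so the copy output is used to send the same events to both stdout and MongoDB.
<match fluent_bit>
  @type copy
  <store>
    @type stdout
  </store>
  <store>
    @type mongo
    database fluentd
    collection fluentdforward
    # Replace <password> and <cluster-host> with the values for your Atlas cluster.
    connection_string "mongodb+srv://fluentduser:<password>@<cluster-host>/test?retryWrites=true&w=majority"
  </store>
</match>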

In the above configuration, we are defining the source type to be forward and providing a bind address and port. We’re also providing a match filter which is `fluent_bit`, so any log it finds with this tag will be consumed. The input logs will be sent to MongoDB Atlas for which we’ve provided the database, collection, and connection_string.

After this, all you need to do is start the Fluentd service if it is not running already. It will not show any output at the moment since we have not yet configured Fluent Bit to forward the logs.

Fluent Bit Configuration

On the Fluent Bit side, we need to configure the INPUT and OUTPUT plugins.

INPUT

[INPUT]
    Name cpu
    Tag fluent_bit

[INPUT]
    Name kmsg
    Tag fluent_bit

[INPUT]
    Name systemd
    Tag fluent_bit

With this, we are collecting the CPU, kernel, and systemd logs and applying a `fluent_bit` tag to them.

OUTPUT

[OUTPUT]
    Name forward
    Match *
    Host 127.0.0.1
    Port 24224

For output, we’re using a forward output plugin that routes the logs to the specified host and port.

Save the configuration and restart the Fluent Bit service. If everything is correct, you’ll see the logs being streamed by Fluentd. Navigate to your MongoDB Atlas UI and refresh the collection; you should see the logs as shown below.

A screenshot shows the "Collections" tab in a database management interface named "fluentd," displaying two JSON query results named "test.fluentforward" with document details, leveraging the Forward Protocol.
MongoDB Atlas UI showing our logs arriving

This way we can use the forward plugin to share logs between Fluent Bit and Fluentd. You can use Forward Protocol with other products that support it to gather logs from different sources and push them to different tools.

Improve your skills with Fluent Bit Academy

To learn more about Fluent Bit and its powerful data processing and routing capabilities, check out Fluent Bit Academy. It’s filled with on-demand videos guiding you through all things Fluent Bit— best practices and how-to’s on advanced processing rules, routing to multiple destinations, and much more. Here’s a sample of what you can find there:

Your destination for best practices and training on all things Fluent Bit

Visit Fluent Bit Academy

About Fluent Bit and Chronosphere

With Chronosphere’s acquisition of Calyptia in 2024, Chronosphere became the primary corporate sponsor of Fluent Bit. Eduardo Silva — the original creator of Fluent Bit and co-founder of Calyptia — leads a team of Chronosphere engineers dedicated full-time to the project, ensuring its continuous development and improvement.

Fluent Bit is a graduated project of the Cloud Native Computing Foundation (CNCF) under the umbrella of Fluentd, alongside other foundational technologies such as Kubernetes and Prometheus. Chronosphere is also a silver-level sponsor of the CNCF.