Administrators using Internet Information Services (IIS) to host websites know that IIS logs can be difficult to search and analyze, especially under pressure to identify the cause of an outage or performance issue. Event Viewer, the default Windows application for searching and analyzing logs, is unintuitive, and many users prefer other tools, which typically require converting the logs into JSON. Although IIS logs are flat files in which each line describes an individual web hit, much like Apache or Nginx logs, they are not as easy to format into JSON.
In this post, we’ll demonstrate a better approach to IIS logging. We’ll show you how to configure IIS to enrich the logs with non-standard metadata. We’ll then collect the logs with Fluent Bit where we will use a custom Wasm plugin to transform and enrich the data. Finally, we’ll have Fluent Bit route our data (now formatted as JSON) to ClickHouse for storage where we can then extract it using Grafana for visualization and analysis.
Fluent Bit is a fast, lightweight, and highly scalable log, metric, and trace processor and forwarder that has been deployed billions of times. It is a Cloud Native Computing Foundation graduated open-source project with an Apache 2.0 license.
Fluent Bit uses a pluggable architecture, enabling new data sources and destinations, processing filters, and other new features to be added with approved plugins. Although there are dozens of supported plugins, there may be times when no out-of-the-box plugin accomplishes the exact task you need.
Thankfully, Fluent Bit lets developers write custom scripts using Lua or WebAssembly for such instances.
WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine. Wasm is designed as a portable compilation target for programming languages, enabling deployment on the web for client and server applications. Developer reference documentation for Wasm can be found on MDN’s WebAssembly pages.
This post covers how Wasm can be used with Fluent Bit to implement custom logic and functionalities.
To achieve the desired outcomes, several tasks need to be addressed. Firstly, data validation should be implemented to ensure the accuracy and integrity of the processed information. Additionally, we should perform type conversion to ensure compatibility and consistency across different data formats.
Moreover, integrating external resources such as APIs or databases can enhance the logs by providing additional relevant information. It is crucial to apply backward compatibility, maintainability, and testability principles to the source code to ensure its longevity and ease of future modifications.
Specifically, we’ll demonstrate how to collect and parse Internet Information Services (IIS) w3c logs (with some custom modifications) and transform the raw string into a structured JSON record.
Organizations need to collect and parse logs generated by IIS (Internet Information Services). In this particular use case, we will explore the significance of utilizing the Fluent Bit WebAssembly (Wasm) plugin to create custom modifications for logs collected in the w3c format.
By leveraging the Fluent Bit Wasm plugin, organizations can enhance their log processing capabilities by implementing tailored transformations and enrichments specific to their requirements. This ability empowers them to extract valuable insights and gain a deeper understanding of their IIS logs, enabling more effective troubleshooting, monitoring, and analysis of their web server infrastructure.
The following diagram provides an overview of the actions we will take:
This diagram highlights an interesting aspect, namely the introduction of WebAssembly in Fluent Bit. In previous versions of Fluent Bit, the workflow for this use case was relatively straightforward. Log information was extracted using parsers that relied on regular expressions or Lua code.
However, with the introduction of the Wasm plugin, Fluent Bit now offers a more versatile and powerful approach to log extraction and processing. Wasm enables the implementation of custom modifications and transformations, allowing for greater flexibility and efficiency in handling log data. This advancement in Fluent Bit’s capabilities opens up new possibilities for extracting and manipulating log information, ultimately enhancing the overall log processing workflow.
Currently, Fluent Bit offers an ecosystem of plugins, filters, and robust parsers with which you can build pipelines and route different workflows. It is possible to create parsers using regular expressions and, using Wasm, to create components in programming languages such as C, Golang, and Rust.
Our use case shows how to use Rust to develop a Wasm plugin.
Note: One of the reasons for using Rust as a programming language is that I previously developed a PoC project to learn Rust. The idea was to create a Fluent Bit-inspired log collector for IIS files, parse the logs, and send them to various destinations (Kafka, Loki, Postgres). Having that code base turned out to be interesting for combining existing logic with the proposal offered by Fluent Bit to integrate Rust with Wasm into its ecosystem.
Configure the IIS log output standard:
By default, IIS w3c logs include fields that may not always provide relevant information for defining usage metrics and access patterns. Additionally, these logs may not cover custom fields specific to our use case.
One example is the c-authorization-header field, which is essential for our analysis but not included in the default log format. Therefore, it becomes necessary to customize the log configuration to include this field and any other custom fields crucial to our specific requirements.
This customization ensures we can access all the information needed to accurately define metrics and gain insights into our IIS server’s usage and access patterns. Our customized field list is:
date time s-sitename s-computername s-ip cs-method cs-uri-stem cs-uri-query s-port c-ip cs(User-Agent) cs(Cookie) cs(Referer) cs-host sc-status sc-bytes cs-bytes time-taken c-authorization-header
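For reference, on IIS 8.5 and later, custom fields such as this one can be added through the enhanced W3C logging configuration. Below is a sketch of the corresponding applicationHost.config fragment; the values shown are illustrative, assuming the Authorization request header is the desired source:
<logFile logFormat="W3C">
  <customFields>
    <!-- Log the Authorization request header as c-authorization-header -->
    <add logFieldName="c-authorization-header" sourceName="Authorization" sourceType="RequestHeader" />
  </customFields>
</logFile>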
To get started, we need to create a new project to construct the filter. Following the official documentation, run this command in our terminal:
cargo new flb_filter_plugin --lib
The command cargo new flb_filter_plugin --lib creates a new Rust project. The --lib flag specifies that the project should be created as a library, which is what we need for developing a Fluent Bit filter plugin.
Next, open the Cargo.toml file and add the following section:
[lib]
crate-type = ["cdylib"]
[dependencies]
serde = { version = "1.0.160", features = ["derive"] }
serde_json = "1.0.104"
serde_bytes = "0.11"
rmp-serde = "1.1"
regex = "1.9.2"
chrono = "0.4.24"
libc = "0.2"
Next, open up src/lib.rs and overwrite it with the following entry point code. We will explain the code in the following section.
// Imports required by this entry point. The LogEntryIIS struct and its
// parse_log_iis_w3c_parser function are defined elsewhere in the project.
use std::io::Write;
use std::os::raw::c_char;
use std::slice;
use std::str;

use chrono::{TimeZone, Utc};
use serde_json::{json, Value};

#[no_mangle]
pub extern "C" fn flb_filter_log_iis_w3c_custom(
tag: *const c_char,
tag_len: u32,
time_sec: u32,
time_nsec: u32,
record: *const c_char,
record_len: u32,
) -> *const u8 {
let slice_tag: &[u8] = unsafe { slice::from_raw_parts(tag as *const u8, tag_len as usize) };
let slice_record: &[u8] =
unsafe { slice::from_raw_parts(record as *const u8, record_len as usize) };
let mut vt: Vec<u8> = Vec::new();
vt.write(slice_tag).expect("Unable to write");
let vtag = str::from_utf8(&vt).unwrap();
let v: Value = serde_json::from_slice(slice_record).unwrap();
let dt = Utc.timestamp_opt(time_sec as i64, time_nsec).unwrap();
let time = dt.format("%Y-%m-%dT%H:%M:%S.%9f %z").to_string();
let input_logs = v["log"].as_str().unwrap();
let mut buf=String::new();
if let Some(el) = LogEntryIIS::parse_log_iis_w3c_parser(input_logs) {
let log_parsered = json!({
"date": el.date_time,
"s_sitename": el.s_sitename,
"s_computername": el.s_computername,
"s_ip": el.s_ip,
"cs_method": el.cs_method,
"cs_uri_stem": el.cs_uri_stem,
"cs_uri_query": el.cs_uri_query,
"s_port": el.s_port,
"c_ip": el.c_ip,
"cs_user_agent": el.cs_user_agent,
"cs_cookie": el.cs_cookie,
"cs_referer": el.cs_referer,
"cs_host": el.cs_host,
"sc_status": el.sc_status,
"sc_bytes": el.sc_bytes.parse::<i32>().unwrap(),
"cs_bytes": el.cs_bytes.parse::<i32>().unwrap(),
"time_taken": el.time_taken.parse::<i32>().unwrap(),
"c_authorization_header": el.c_authorization_header,
"tag": vtag,
"source": "LogEntryIIS",
"timestamp": format!("{}", time)
});
let message = json!({
"log": log_parsered,
"s_sitename": el.s_sitename,
"s_computername": el.s_computername,
"cs_host": el.cs_host,
"date": el.date_time,
});
buf= message.to_string();
}
buf.as_ptr()
}
This Rust code defines a function called flb_filter_log_iis_w3c_custom, which is intended to be used as a filter plugin in Fluent Bit with the WebAssembly module.
let slice_tag: &[u8] = unsafe { slice::from_raw_parts(tag as *const u8, tag_len as usize) };
let slice_record: &[u8] =
unsafe { slice::from_raw_parts(record as *const u8, record_len as usize) };
let mut vt: Vec<u8> = Vec::new();
vt.write(slice_tag).expect("Unable to write");
let vtag = str::from_utf8(&vt).unwrap();
let v: Value = serde_json::from_slice(slice_record).unwrap();
let dt = Utc.timestamp_opt(time_sec as i64, time_nsec).unwrap();
let time = dt.format("%Y-%m-%dT%H:%M:%S.%9f %z").to_string();
The function takes several parameters: tag, tag_len, time_sec, time_nsec, record, and record_len. These parameters carry the tag, timestamp, and log record information passed from Fluent Bit.
The code then converts the received parameters into Rust slices (&[u8]) to work with the data. It creates a mutable vector (Vec<u8>) called vt and writes the tag data into it. The vtag variable is created by converting the vt vector into a UTF-8 string.
Next, the code deserializes the record data into a serde_json::Value object called v.
The incoming structured logs are:
{"log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 GET / - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 304 142 756 1078 -"}
It also converts the time_sec and time_nsec values into a DateTime object using the Utc.timestamp_opt function.
The code then extracts specific fields from the v object and assigns them to variables. These fields represent various properties of an IIS log entry, such as the date, site name, computer name, IP address, HTTP method, URI, status codes, and more.
let input_logs = v["log"].as_str().unwrap();
let mut buf=String::new();
if let Some(el) = LogEntryIIS::parse_log_iis_w3c_parser(input_logs) {
let log_parsered = json!({
"date": el.date_time,
"s_sitename": el.s_sitename,
"s_computername": el.s_computername,
"s_ip": el.s_ip,
"cs_method": el.cs_method,
"cs_uri_stem": el.cs_uri_stem,
"cs_uri_query": el.cs_uri_query,
"s_port": el.s_port,
"c_ip": el.c_ip,
"cs_user_agent": el.cs_user_agent,
"cs_cookie": el.cs_cookie,
"cs_referer": el.cs_referer,
"cs_host": el.cs_host,
"sc_status": el.sc_status,
"sc_bytes": el.sc_bytes.parse::<i32>().unwrap(),
"cs_bytes": el.cs_bytes.parse::<i32>().unwrap(),
"time_taken": el.time_taken.parse::<i32>().unwrap(),
"c_authorization_header": el.c_authorization_header,
"tag": vtag,
"source": "LogEntryIIS",
"timestamp": format!("{}", time)
});
let message = json!({
"log": log_parsered,
"s_sitename": el.s_sitename,
"s_computername": el.s_computername,
"cs_host": el.cs_host,
"date": el.date_time,
});
buf= message.to_string();
}
buf.as_ptr()
If the log entry can be successfully parsed using the LogEntryIIS::parse_log_iis_w3c_parser function, the code constructs a new JSON object representing the parsed log entry. It includes additional fields like the tag, source, and timestamp. The log entry and some specific fields are also included in a separate JSON object called message.
Finally, the code converts the message object to a string and assigns it to the buf variable. The function returns a pointer to the buf string, which will be used by Fluent Bit.
In summary, this code defines a custom filter plugin for Fluent Bit that processes IIS w3c log records, extracts specific fields, and constructs new JSON objects representing the parsed log entries.
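The LogEntryIIS struct and its parser are not shown above. Here is a minimal sketch of what they might look like, with field names inferred from the JSON construction earlier; the production implementation lives in the repository linked below:
pub struct LogEntryIIS {
    pub date_time: String,
    pub s_sitename: String,
    pub s_computername: String,
    pub s_ip: String,
    pub cs_method: String,
    pub cs_uri_stem: String,
    pub cs_uri_query: String,
    pub s_port: String,
    pub c_ip: String,
    pub cs_user_agent: String,
    pub cs_cookie: String,
    pub cs_referer: String,
    pub cs_host: String,
    pub sc_status: String,
    pub sc_bytes: String,
    pub cs_bytes: String,
    pub time_taken: String,
    pub c_authorization_header: String,
}

impl LogEntryIIS {
    // Split one customized W3C line into its 19 space-separated tokens;
    // return None when the line does not match the expected field count.
    pub fn parse_log_iis_w3c_parser(line: &str) -> Option<LogEntryIIS> {
        let f: Vec<&str> = line.split_whitespace().collect();
        if f.len() != 19 {
            return None;
        }
        let s = |i: usize| f[i].to_string();
        Some(LogEntryIIS {
            date_time: format!("{} {}", f[0], f[1]), // date + time
            s_sitename: s(2),
            s_computername: s(3),
            s_ip: s(4),
            cs_method: s(5),
            cs_uri_stem: s(6),
            cs_uri_query: s(7),
            s_port: s(8),
            c_ip: s(9),
            cs_user_agent: s(10),
            cs_cookie: s(11),
            cs_referer: s(12),
            cs_host: s(13),
            sc_status: s(14),
            sc_bytes: s(15),
            cs_bytes: s(16),
            time_taken: s(17),
            c_authorization_header: s(18),
        })
    }
}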
The rest of the code is hosted at https://github.com/kenriortega/flb_filter_iis.git. It is an open-source project and currently provides two functions focused on the current need: parsing and processing a specific format. However, it is subject to new proposals and ideas to grow the project as a suite of possible use cases.
To compile the plugin, we suggest consulting the official Fluent Bit documentation for instructions on building Wasm plugins from your local environment, including the requirements for installing the Rust toolchain and the Wasm target.
$ cargo build --target wasm32-unknown-unknown --release
$ ls target/wasm32-unknown-unknown/release/*.wasm
target/wasm32-unknown-unknown/release/filter_rust.wasm
If you prefer to use the plugin from the repository, there is a releases section where it is automatically compiled using GitHub Actions.
To reproduce the demo, a docker-compose.yaml file is included in the repository; it defines the resources needed for the steps below.
version: '3.8'
volumes:
clickhouse:
services:
clickhouse:
container_name: clickhouse
image: bitnami/clickhouse:latest
environment:
- ALLOW_EMPTY_PASSWORD=no
- CLICKHOUSE_ADMIN_PASSWORD=default
ports:
- 8123:8123
fluent-bit:
image: cr.fluentbit.io/fluent/fluent-bit
container_name: fluent-bit
ports:
- 8888:8888
- 2020:2020
volumes:
- ./docker/conf/fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
- ./target/wasm32-unknown-unknown/release/flb_filter_iis_wasm.wasm:/plugins/flb_filter_iis_wasm.wasm
- ./docker/dataset:/dataset/
grafana:
image: grafana/grafana:latest
environment:
- GF_PATHS_PROVISIONING=/etc/grafana/provisioning
- GF_AUTH_ANONYMOUS_ENABLED=false
- GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
depends_on:
- clickhouse
ports:
- "3000:3000"
We next configure Fluent Bit to process the logs collected from IIS. To make this tutorial more practical, we will use the dummy input plugin to generate sample logs. We define several inputs that simulate GET and POST requests with different status codes (200, 401, and 404).
[INPUT]
Name dummy
Dummy {"log": "2023-07-20 17:18:54 W3SVC279 WIN-PC1 192.168.1.104 GET /api/Site/site-data qName=quww 13334 10.0.0.0 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/114.0.0.0+Safari/537.36+Edg/114.0.1823.82 _ga=GA2.3.499592451.1685996504;+_gid=GA2.3.1209215542.1689808850;+_ga_PC23235C8Y=GS2.3.1689811012.8.0.1689811012.0.0.0 http://192.168.1.104:13334/swagger/index.html 192.168.1.104:13334 200 456 1082 3131 Bearer+token"}
Tag log.iis.*
[INPUT]
Name dummy
Dummy {"log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 GET / - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 404 142 756 1078 -"}
Tag log.iis.get
[INPUT]
Name dummy
Dummy {"log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 POST / - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 200 142 756 1078 -"}
Tag log.iis.post
[INPUT]
Name dummy
Dummy {"log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 POST/ - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 401 142 756 1078 -"}
Tag log.iis.post
[FILTER]
Name wasm
match log.iis.*
WASM_Path /plugins/flb_filter_iis_wasm.wasm
Function_Name flb_filter_log_iis_w3c_custom
accessible_paths .
This Fluent Bit filter configuration specifies the usage of a WebAssembly filter plugin to process log records that match the pattern log.iis.*.
- Name: the name of the filter plugin to load; in this case, wasm.
- WASM_Path: the path to the WebAssembly module file that contains the filter plugin implementation.
- Function_Name: the name of the function within the WebAssembly module that will be used as the filter implementation.
The stdout output is used to check and visualize in the terminal the output result after filter processing.
[OUTPUT]
name stdout
match log.iis.*
The result is as follows:
2023-10-21 09:36:33 [0] log.iis.post: [[1697906192.407803136, {}], {"cs_host"=>"localhost", "date"=>"2023-08-11 19:56:44", "log"=>{"c_authorization_header"=>"-", "c_ip"=>"::1", "cs_bytes"=>756, "cs_cookie"=>"-", "cs_host"=>"localhost", "cs_method"=>"POST", "cs_referer"=>"-", "cs_uri_query"=>"-", "cs_uri_stem"=>"/", "cs_user_agent"=>"Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200", "date"=>"2023-08-11 19:56:44", "s_computername"=>"WIN-PC1", "s_ip"=>"::1", "s_port"=>"80", "s_sitename"=>"W3SVC1", "sc_bytes"=>142, "sc_status"=>"200", "source"=>"LogEntryIIS", "tag"=>"log.iis.post", "time_taken"=>1078, "timestamp"=>"2023-10-21T16:36:32.407803136 +0000"}, "s_computername"=>"WIN-PC1", "s_sitename"=>"W3SVC1"}]
The output is a log record that has been processed by Fluent Bit with the specified filter configuration. This transformation offers all the advantages of our code implementation, data validation, and type conversion.
# Should be optional.
[OUTPUT]
name http
tls off
match *
host clickhouse
port 8123
URI /?query=INSERT+INTO+fluentbit.iis+FORMAT+JSONEachRow
format json_stream
json_date_key timestamp
json_date_format epoch
http_user default
http_passwd default
To ingest these logs inside ClickHouse, we need to use the http output module. The http output plugin of Fluent Bit allows flushing records into an HTTP endpoint. The plugin issues a POST request with the data records in MessagePack (or JSON). The plugin supports dynamic tags, which allow sending data with different tags through the same input.
Please refer to the official documentation for more information on Fluent Bit’s HTTP output module.
The ClickHouse database must have the following configuration, which was taken from the article Sending Kubernetes logs To ClickHouse with Fluent Bit.
With the structured logs parsed by our filter, it is possible to perform queries that allow us to analyze the behavior of our websites and APIs hosted on IIS.
First, we need to create the database using your client of choice.
CREATE DATABASE fluentbit;
SET allow_experimental_object_type = 1;
CREATE TABLE fluentbit.iis
(
log JSON,
s_sitename String,
s_computername String,
cs_host String,
date Datetime
)
Engine = MergeTree ORDER BY tuple(date,s_sitename,s_computername,cs_host)
TTL date + INTERVAL 3 MONTH DELETE;
This query is written in ClickHouse syntax, and it creates a database named “fluentbit” and a table named “iis” within that database. Let’s break down the query step by step:
- CREATE DATABASE fluentbit: creates a new database named “fluentbit” if it doesn’t already exist.
- SET allow_experimental_object_type = 1;: enables experimental object types in ClickHouse, allowing the use of certain experimental features (such as the JSON column type) that may not be entirely stable or supported.
- CREATE TABLE fluentbit.iis: creates a new “iis” table within the “fluentbit” database. The table contains the following columns:
  - log: has the data type JSON, which means it can store data in JSON format.
  - s_sitename: a String column that stores the site name.
  - s_computername: a String column that stores the computer name.
  - cs_host: a String column that stores the host.
  - date: a Datetime column that stores the date and time.
- Engine = MergeTree ORDER BY tuple(date, s_sitename, s_computername, cs_host): specifies the storage engine for the “iis” table as MergeTree, a popular ClickHouse engine that efficiently handles time-series data. The ORDER BY clause specifies the primary sorting order of the table, based on the columns date, s_sitename, s_computername, and cs_host.
- TTL date + INTERVAL 3 MONTH DELETE;: sets a Time-to-Live (TTL) rule on the table. ClickHouse will automatically delete rows whose date column is older than three months, which helps manage the data and keep the table size under control.
We can check that our workflow is properly working by checking the data entry:
SET output_format_json_named_tuples_as_objects = 1;
SELECT log FROM fluentbit.iis
LIMIT 1000 FORMAT JSONEachRow;
Now that we have confirmed that ClickHouse is successfully receiving data from Fluent Bit, we can perform queries that provide us with information about the performance and behavior of our sites.
For example, to get the averages of time_taken, sc_bytes, and cs_bytes:
SELECT AVG(log.time_taken), AVG(log.sc_bytes), AVG(log.cs_bytes) FROM fluentbit.iis;
Another example is grouping by IP. This query is an aggregation on the “fluentbit.iis” table:
SET output_format_json_named_tuples_as_objects = 1;
SELECT COUNT(*), c_ip FROM fluentbit.iis
GROUP BY log.c_ip as c_ip;
- SELECT COUNT(*), c_ip: specifies the columns to select in the result. It retrieves two values: the count of rows (COUNT(*)) and the value of the c_ip column.
- FROM fluentbit.iis: indicates the table to select data from; in this case, the “iis” table within the “fluentbit” database.
- GROUP BY log.c_ip as c_ip: groups the rows based on the values of the log.c_ip column and assigns the alias c_ip to the result. log.c_ip refers to the c_ip field within the log JSON column.
We can also count rows that match a condition:
SELECT count(*)
FROM fluentbit.iis
WHERE log.sc_status LIKE '4%';
This query calculates the count of rows that meet a specific condition in the “fluentbit.iis” table:
- SELECT count(*): specifies that we want to calculate the count of rows that match the given condition.
- FROM fluentbit.iis: indicates the table we want to retrieve the data from; in this case, the “iis” table within the “fluentbit” database.
- WHERE log.sc_status LIKE '4%': specifies the condition a row must satisfy to be included in the count. The LIKE operator with the pattern 4% matches any string that starts with 4. This condition matches statuses starting with 4, which typically represent client errors in HTTP responses (e.g., 400 Bad Request, 404 Not Found).
These and many other queries regarding the collected logs can be performed according to our needs.
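For instance, a hypothetical query to surface the slowest endpoints by average response time might look like this:
SELECT log.cs_uri_stem AS uri, AVG(log.time_taken) AS avg_time_taken
FROM fluentbit.iis
GROUP BY uri
ORDER BY avg_time_taken DESC
LIMIT 10;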
Now that our records are stored in a database, we can use a visualization tool like Grafana for analysis rather than relying solely on pure SQL.
ClickHouse makes this process easy by offering a plugin for Grafana. The Grafana plugin allows users to connect directly to ClickHouse, enabling them to create interactive dashboards and visually explore their data.
With Grafana’s intuitive interface and powerful visualization capabilities, users can gain valuable insights and make data-driven decisions more effectively. To learn more about connecting Grafana to ClickHouse, you can find detailed documentation and instructions on the official ClickHouse website: Connecting Grafana to ClickHouse.
The Fluent Bit Wasm filter approach provides several powerful advantages inherent to programming languages, such as data validation and type conversion for fields like sc_bytes, cs_bytes, and time_taken. This is particularly useful when we need to validate our data results.
To learn more about Fluent Bit and its powerful data processing and routing capabilities, check out Fluent Bit Academy. It’s filled with on-demand videos guiding you through all things Fluent Bit: best practices and how-to’s on advanced processing rules, routing to multiple destinations, and much more.
With Chronosphere’s acquisition of Calyptia in 2024, Chronosphere became the primary corporate sponsor of Fluent Bit. Eduardo Silva — the original creator of Fluent Bit and co-founder of Calyptia — leads a team of Chronosphere engineers dedicated full-time to the project, ensuring its continuous development and improvement.
Fluent Bit is a graduated project of the Cloud Native Computing Foundation (CNCF) under the umbrella of Fluentd, alongside other foundational technologies such as Kubernetes and Prometheus. Chronosphere is also a silver-level sponsor of the CNCF.
Modern applications generate a massive volume of logs and other telemetry data, which requires an efficient log management solution. Loki, an open-source log aggregation system from Grafana, is a popular solution for companies. It allows for storing, searching, and analyzing huge volumes of data quickly and easily. Loki is part of the Grafana open-source observability stack called LGTM, which stands for Loki for logs, Grafana for visualization, Tempo for traces, and Mimir for metrics. For this blog, we will look into Loki.
Although Grafana offers its own collector agent called Promtail for sending logs to Loki, we’ll demonstrate how to use Fluent Bit, a leading open-source solution for collecting, processing, and routing large volumes of telemetry data.
So, if Promtail is specifically engineered for sending logs to Loki, why would we use Fluent Bit instead? In short, precisely because Promtail is engineered only for Loki: Fluent Bit is a more versatile, flexible, and powerful option.
The vendor neutrality of Fluent Bit provides a significant benefit in terms of flexibility and interoperability. Since it is not tied to any specific vendor or product, it can be easily integrated into any existing technology stack, making it a highly adaptable and versatile solution for log forwarding and processing. This allows organizations to avoid vendor lock-in and choose the best tools for their specific needs rather than being limited to a single vendor’s products.
It can send data to all of the major backends, such as Chronosphere Observability Platform, Elasticsearch, Splunk, Datadog, and more. This helps you to avoid costly vendor lock-in. Transitioning to a new backend is a simple configuration change — no new vendor-specific agent to install across your entire infrastructure. A single Fluent Bit agent can be configured to send data to Loki, Chronosphere, Elasticsearch, Splunk, and Datadog all at the same time.
Fluent Bit is a lightweight and fast solution for sending logs to Loki. It requires fewer system resources and runs faster than other log collection agents. As a result, it can process a massive amount of log data with minimal impact on system performance. Its footprint is only ~450KB, but it certainly punches above its weight class when it comes to processing millions of records daily.
Fluent Bit supports multiple platforms, including Windows, Linux, macOS, and Kubernetes, making it an ideal solution for companies with diverse IT environments.
Fluent Bit is open source. Fluent Bit is a graduated Cloud Native Computing Foundation project under the Fluentd umbrella.
Fluent Bit offers a wide range of input and output plugins allowing it to collect and send logs from various sources to various destinations. These plugins include file, syslog, TCP, HTTP, and more.
Fluent Bit is easy to configure and deploy, even for users with limited technical expertise. Its configuration files are written in a simple and easy-to-understand syntax, and it offers extensive documentation and community support.
Fluent Bit is trusted. Fluent Bit has been downloaded and deployed billions of times. In fact, it is included with major Kubernetes distributions, including Google Kubernetes Engine (GKE), AWS Elastic Kubernetes Service (EKS), and Azure Kubernetes Service (AKS).
Now that we know the benefits and why organizations prefer to use Fluent Bit, let’s go over some basic concepts before we get started on the demo.
At a high level, Fluent Bit works by taking data from various sources (Inputs), parsing and transforming that data (Parsers), and then sending it to various destinations (Outputs).
Here’s a brief explanation of each component in Fluent Bit:
Inputs: These are data sources that Fluent Bit reads from. Examples include log files, Docker containers, system metrics, and many more. Fluent Bit supports a wide range of Inputs out of the box.
Parsers: Once Fluent Bit has read the data, it can be transformed using Parsers. These are responsible for recognizing the format of the incoming data, extracting the relevant fields, and mapping them to a structured format. Parsers help make the data more manageable and can even allow for real-time alerting.
Outputs: The last step in Fluent Bit’s data collection process is sending the transformed data to various destinations. Outputs include Elasticsearch, Kafka, Amazon S3, and many more. Fluent Bit allows for the routing and filtering of data before it is sent to the desired destination, making it a powerful tool for managing large amounts of data.
You can read more about Fluent Bit in detail here.
Fluent Bit acts as a bridge between your logs and Loki, which is a horizontally scalable, highly available, multi-tenant log aggregation system. Fluent Bit can parse and structure logs into the format Loki requires, making it easier to search and analyze log data. It can also compress and batch logs, reducing network bandwidth and improving performance.
By using Fluent Bit to send logs to Loki, you can take advantage of Loki’s advanced features, such as query language and alerts, to gain insights into your applications and infrastructure. Fluent Bit can also be configured to use the Loki API to create, update, and delete labels for your log streams, enabling better organization and filtering of log data. Overall, Fluent Bit can help make log collection and analysis more efficient and effective, particularly in large-scale environments.
For this demo, you will need to have Docker and Docker Compose installed. If you don’t have them already, you can follow the official Docker Compose installation documentation, which has very well-articulated steps. Lastly, you need a Grafana Cloud account (a trial account would work for this demo).
Once you’re done with the installation, let’s look at the configuration for Fluent Bit.
Fluent Bit can be configured using a configuration file or environment variables. The configuration file is written in a simple syntax, and it allows for easy management of complex pipelines. Environment variables can also be used to configure Fluent Bit, and they provide a simple way to pass configuration data without needing a configuration file. Once the configuration is set up, Fluent Bit can be run as a standalone process or as a sidecar in containerized environments.
For this demo, we will be going ahead with a configuration file.
[SERVICE]
flush 1
log_level info
Currently, the file only contains information about the service, which defines the global behavior of the Fluent Bit engine. We also need to define input and outputs as well.
Fluent Bit accepts data from a variety of sources using input plugins. The tail input plugin allows you to read from a text log file as though you were running the tail -f command.
Add the following to your fluent-bit.conf file:
[SERVICE]
flush 1
log_level info
[INPUT]
name tail
path /etc/data/data.log
tag log_generator
The path parameter in Fluent Bit’s configuration may need to be adjusted based on the source of your logs. The plugin name, which identifies which plugin Fluent Bit should load, cannot be customized by the user. The tag parameter is optional but can be used for routing and filtering your data, as discussed in more detail below.
As with inputs, Fluent Bit uses output plugins to send the gathered data to their desired destinations.
To set up your configuration, you will need to gather some information from your Grafana Cloud account: the Loki host name, user name, and API key.
Once you have gathered the required information, add the following to your fluent-bit.conf file below the Input section.
[SERVICE]
flush 1
log_level info
[INPUT]
name tail
path /etc/data/data.log
tag log_generator
[OUTPUT]
Name stdout
Match *
[OUTPUT]
# for sending logs to local Loki instance
name loki
match *
host loki
port 3100
labels job=fluentbit
[OUTPUT]
# for sending logs to cloud Loki instance
Name loki
Match *
Host HOST_NAME
port 443
tls on
tls.verify on
http_user USER_NAME
line_format json
labels job=fluentbit
http_passwd API_KEY
Tip: If you want more details on each output parameter of the Loki plugin, check out the Fluent Bit Loki output documentation at https://docs.fluentbit.io/manual/pipeline/outputs/loki.
The Match * parameter indicates that all of the data gathered by Fluent Bit will be forwarded to the Loki instance. We could also match based on a tag defined in the input plugin. tls on ensures that the connection between Fluent Bit and the Loki instance is secure. Note that the local Loki instance listens on port 3100, while Grafana Cloud Loki is reached over TLS on port 443, which is why the two outputs use different ports.
Note: We have also defined a secondary output that sends all the data to stdout. This is not required for the Loki configuration but can be incredibly helpful if we need to debug our configuration.
For ease of setup, I’ve written a docker-compose file that will help you get started with all the necessary pieces: Grafana, Loki, a log generator, and a Fluent Bit instance, all running locally.
version: "3"
networks:
loki:
volumes:
log-data:
driver: local
services:
flog-log:
image: mingrammer/flog
command: "-f json -t log -l -w -d 5s -o /etc/data/data.log"
volumes:
- log-data:/etc/data
fluent-bit:
image: fluent/fluent-bit
volumes:
- ./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
- log-data:/etc/data
depends_on:
- loki
networks:
- loki
loki:
image: grafana/loki:2.7.0
ports:
- "3100:3100"
command: -config.file=/etc/loki/local-config.yaml
networks:
- loki
grafana:
image: grafana/grafana:latest
ports:
- "3000:3000"
networks:
- loki
environment:
- GF_PATHS_PROVISIONING=/etc/grafana/provisioning
- GF_AUTH_ANONYMOUS_ENABLED=true
- GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
entrypoint:
- sh
- -euc
- |
mkdir -p /etc/grafana/provisioning/datasources
cat <<EOF > /etc/grafana/provisioning/datasources/ds.yaml
apiVersion: 1
datasources:
- name: Loki
type: loki
access: proxy
orgId: 1
url: http://loki:3100
basicAuth: false
isDefault: true
version: 1
editable: true
EOF
/run.sh
Put the fluent-bit.conf and docker-compose.yaml files in the same directory, then run the command below from that directory to get things up and running.
➜ demo-fluent-bit docker-compose up
[+] Running 4/0
⠿ Container demo-fluent-bit-flog-log-1 Created 0.0s
⠿ Container demo-fluent-bit-loki-1 Created 0.0s
⠿ Container demo-fluent-bit-grafana-1 Created 0.0s
⠿ Container demo-fluent-bit-fluent-bit-1 Created 0.0s
Attaching to demo-fluent-bit-flog-log-1, demo-fluent-bit-fluent-bit-1, demo-fluent-bit-grafana-1, demo-fluent-bit-loki-1
demo-fluent-bit-loki-1 | level=info ts=2023-02-20T09:45:30.981870161Z caller=main.go:103 msg="Starting Loki" version="(version=2.7.0, branch=HEAD, revision=1b627d880)"
demo-fluent-bit-loki-1 | level=info ts=2023-02-20T09:45:30.982532796Z caller=server.go:323 http=[::]:3100 grpc=[::]:9095 msg="server listening on addresses"
demo-fluent-bit-loki-1 | level=warn ts=2023-02-20T09:45:30.986650564Z caller=cache.go:114 msg="fifocache config is deprecated. use embedded-cache instead"
demo-fluent-bit-loki-1 | level=warn ts=2023-02-20T09:45:30.986689968Z caller=experimental.go:20 msg="experimental feature in use" feature="In-memory (FIFO) cache - chunksembedded-cache"
demo-fluent-bit-loki-1 | level=info ts=2023-02-20T09:45:30.987051548Z caller=table_manager.go:404 msg="loading local table index_19408"
Once all the services defined in docker-compose are up, you can head over to localhost:3000, where our Grafana instance (with Loki as a data source) is exposed.
Logs should now be arriving at the cloud Loki instance. At the same time, they are arriving at our local Loki instance; since we’ve already added Loki as a data source, we can explore them in Grafana now.
That’s it: you have successfully built your own log pipeline.
We’ve just seen a basic configuration for getting log data from Fluent Bit into Loki in Grafana Cloud. The Fluent Bit Loki output plugin supports many additional parameters that enable you to fine-tune your Fluent Bit to the Grafana Loki pipeline. Check out the Fluent Bit documentation for more.
To learn even more about Fluent Bit, check out Fluent Bit Academy, your destination for best practices and how-to’s on advanced processing, routing, and all things Fluent Bit.
As we have written previously, having access to Kubernetes metadata can enhance traceability and significantly reduce mean time to remediate (MTTR). However, the metadata you need may not be included in the logs. The Fluent Bit Kubernetes filter plugin makes it easy to enrich your logs with the metadata you need to troubleshoot issues.
When run in Kubernetes (K8s) as a daemonset, Fluent Bit can ingest Kubelet logs and enrich them with additional metadata from the Kubernetes API server. This includes any annotations or labels on the pod and information about the namespace, pod, and the container the log is from. It is very simple to do, and, in fact, it is also the default setup when deploying Fluent Bit via the helm chart.
The documentation goes into the full details of what metadata are available and how to configure the Fluent Bit Kubernetes filter plugin to gather them. In this post, we’ll give an overview of how the filter works and provide common troubleshooting tips, particularly with issues caused by misconfiguration.
Let us take a step back and look at what information is required to query the K8s API server for metadata about a particular pod. We need two things:
- the name of the pod
- the namespace the pod is running in
Conveniently, the Kubelet logs on the node provide this information in their filenames by design. This enables Fluent Bit to query the K8s API server when all it has is the log file. Therefore, given a pod log file(name), we should be able to query the K8s API server for the rest of the metadata describing the pod.
First off, we need the actual logs from the Kubelet. This is typically done by using a daemonset to ensure a Fluent Bit pod runs on every node and then mounts the Kubelet logs from the node into the pod.
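A minimal sketch of the relevant part of such a daemonset pod spec might look like the following; the volume name is a placeholder:
containers:
  - name: fluent-bit
    image: cr.fluentbit.io/fluent/fluent-bit
    volumeMounts:
      - name: varlog          # expose the node's log directory inside the pod
        mountPath: /var/log
        readOnly: true
volumes:
  - name: varlog
    hostPath:
      path: /var/log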
Now that we have the log files themselves, we should be able to extract enough information to query the K8s API server. We do this with a default setup using the tail plugin to read the log files and inject the filename into the tag:
[INPUT]
Name tail
Tag kube.*
Path /var/log/containers/*.log
multiline.parser docker, cri
Wildcards in the tag are handled in a special way by the tail input plugin. This configuration injects the full path and filename of the log file into the tag after the kube. prefix.
Once the kubernetes filter receives these records, it parses the tag to extract the information required. To do so, it needs the kube_tag_prefix value to strip off any redundant tag or path, leaving just the log filename containing the details required to query the K8s API server. Using the defaults would look like this:
[FILTER]
Name kubernetes
Match kube.*
Kube_Tag_Prefix kube.var.log.containers.
Fluent Bit inserts the extra metadata from the K8s API server under the top-level kubernetes key.
Using an example, we can see how this flows through the system.
Assume this is our log file:
/var/log/containers/apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log
The resulting tag would be (slashes are replaced with dots):
kube.var.log.containers.apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log
We then strip off the kube_tag_prefix:
apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log
Now we can extract the relevant fields with a regex:
(?<pod_name>[a-z0-9](?:[-a-z0-9]*[a-z0-9])?(?:\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*)_(?<namespace_name>[^_]+)_(?<container_name>.+)-(?<container_id>[a-z0-9]{64})\.log$
pod_name: apache-logs-annotated
namespace_name: default
container_name: apache
container_id: aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6
The Fluent Bit Kubernetes filter extracts information from the log filename in order to query the K8s API server and retrieve metadata that is then added to the log record.
While the Fluent Bit Kubernetes filter takes care of the hard part of extracting the K8s metadata from the API server and adding it to the logs, when users experience difficulty, it is usually the result of misconfiguration. There are a few common errors that we frequently see in community channels. Let’s discuss how to identify these issues and correct them.
The most common problems occur when the default tag is changed for the tail input plugin or when a different path is used for the logs. When this happens, the kube_tag_prefix must also be changed to ensure it strips everything off except the filename.
The kubernetes filter will otherwise end up with a garbage filename that it either complains about immediately, or it injects invalid data into the request to the K8s API server. In either case, the filter will not enrich the log record as it has no additional data to add.
Typically, you will see a warning message in the log if the tag is obviously wrong, or with log_level debug, you can see the requests to the K8s API server with invalid pod name or namespace plus the response indicating there is no such pod.
$ kubectl logs fluent-bit-cs6sg
…
[2023/11/30 10:08:14] [debug] [filter:kubernetes:kubernetes.0] Send out request to API Server for pods information
[2023/11/30 10:08:14] [debug] [http_client] not using http_proxy for header
[2023/11/30 10:08:14] [debug] [http_client] server kubernetes.default.svc:443 will close connection #60
[2023/11/30 10:08:14] [debug] [filter:kubernetes:kubernetes.0] Request (ns=default, pod=s.fluent-bit-cs6sg) http_do=0, HTTP Status: 404
[2023/11/30 10:08:14] [debug] [filter:kubernetes:kubernetes.0] HTTP response
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \"s.fluent-bit-cs6sg\" not found","reason":"NotFound","details":{"name":"s.fluent-bit-cs6sg","kind":"pods"},"code":404}
This example was created using a configuration file like the one below for the official helm chart. As you can see, we have added two characters to the default tag prefix (my), and in the error details above the pod name has two extra characters in its prefix: it should be fluent-bit-cs6sg but is s.fluent-bit-cs6sg. No such pod exists, so the filter reports a failure. Without log_level debug, you just get no metadata.
config:
service: |
[SERVICE]
Daemon Off
Log_Level debug
inputs: |
[INPUT]
Name tail
Path /var/log/containers/*.log
multiline.parser docker, cri
Tag mykube.*
Mem_Buf_Limit 5MB
Skip_Long_Lines On
filters: |
[FILTER]
Name kubernetes
Match *
Using wildcards in the tail input plugin can trip you up sometimes: the * wildcard is replaced by the full path of the file but with any special characters (e.g. /) replaced with dots (.).
Beware of modifying the default kube.* tag in this case, and — as I try to stress as much as possible — use stdout to see the actual tags you are getting if you have any issues. As an example, consider the following tail configuration:
[INPUT]
Name tail
Path /var/log/containers/*.log
Now, you will get tags that look like this depending on what you configure:
Tag kube.* gives: kube.var.log.containers.apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log
Tag kube_* gives: kube_.var.log.containers.apache-logs-annotated_default_apache-aeeccc7a9f00f6e4e066aeff0434cf80621215071f1b20a51e8340aa7c35eac6.log
In the second case, notice that we have an underscore followed by a dot, whereas in the first case there is no double dot, as it is automatically collapsed by the input plugin. This can mean your filters do not match later on and can cause confusing problems. The first step is always the trusty stdout output, though, to verify.
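For reference, a throwaway stdout output like this prints each record with its tag, so you can see exactly what the tail input produced:
[OUTPUT]
    name  stdout
    match *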
The Fluent Bit pod must have the relevant roles added to its service account that allow it to query the K8s API for the information it needs. Unfortunately, this error is typically just reported as a connectivity warning to the K8s API server, so it can be easily missed.
To troubleshoot this issue, use log_level debug to see the response from the K8s API server. The message will basically say “missing permissions to do X” or something similar and then it is obvious what is wrong.
[2022/12/08 15:53:38] [ info] [filter:kubernetes:kubernetes.0] testing connectivity with API server...
[2022/12/08 15:53:38] [debug] [filter:kubernetes:kubernetes.0] Send out request to API Server for pods information
[2022/12/08 15:53:38] [debug] [http_client] not using http_proxy for header
[2022/12/08 15:53:38] [debug] [http_client] server kubernetes.default.svc:443 will close connection #23
[2022/12/08 15:53:38] [debug] [filter:kubernetes:kubernetes.0] Request (ns=default, pod=calyptia-cluster-logging-316c-dcr7d) http_do=0, HTTP Status: 403
[2022/12/08 15:53:38] [debug] [filter:kubernetes:kubernetes.0] HTTP response
{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \"calyptia-cluster-logging-316c-dcr7d\" is forbidden: User \"system:serviceaccount:default:default\" cannot get resource \"pods\" in API group \"\" in the namespace \"default\"","reason":"Forbidden","details":{"name":"calyptia-cluster-logging-316c-dcr7d","kind":"pods"},"code":403}
[2022/12/08 15:53:38] [ warn] [filter:kubernetes:kubernetes.0] could not get meta for POD calyptia-cluster-logging-316c-dcr7d
In the example above, you can see that without log_level debug all you will get is the warning message:
[2022/12/08 15:53:38] [ info] [filter:kubernetes:kubernetes.0] testing connectivity with API server...
[2022/12/08 15:53:38] [ warn] [filter:kubernetes:kubernetes.0] could not get meta for POD calyptia-cluster-logging-316c-dcr7d
Kubernetes has evolved over the years, and new container runtimes have also come along. As a result, the filename requirements for Kubelet logs may be handled using a symlink from a correctly named pod log file to the actual container log file created by the container runtime. When mounting the pod logs into your container, ensure they are not dangling links and that their destination is also correctly mounted.
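One quick way to check from inside the Fluent Bit pod is to list the symlinks and confirm their targets resolve; the pod log name below is a placeholder:
ls -l /var/log/containers/ | head
readlink -e /var/log/containers/<pod-log-name>.log   # fails if the link is dangling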
Fluent Bit caches the response from the K8s API server to prevent rate limiting or overloading the server. As a result, if annotations or labels are applied or removed dynamically, then those changes will not be seen until the next time the cache is refreshed. A simple test is just to roll/delete the pod so a fresh one is deployed and check if it picks up the changes.
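Alternatively, the filter exposes a Kube_Meta_Cache_TTL option that expires cached entries so that updated labels and annotations are eventually picked up. A sketch, with an illustrative TTL of 300 seconds:
[FILTER]
    Name                kubernetes
    Match               kube.*
    Kube_Tag_Prefix     kube.var.log.containers.
    Kube_Meta_Cache_TTL 300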
Another common misconfiguration is using custom container runtime parsers in the tail input. This is generally a legacy issue, as previously there were no built-in CRI or docker multiline parsers. The current recommendation is always to configure the tail input using the provided parsers, as per the documentation:
[INPUT]
name tail
path /var/log/containers/*.log
multiline.parser docker, cri
Do not use your own CRI or docker parsers, as they must cope with merging partial lines (identified with a P instead of an F).
The parsers for the tail plugin are not applied sequentially but are mutually exclusive, with the first one matching being applied. The goal is to handle multiline logs created by the Kubelet itself. Later, you can have another filter to handle multiline parsing of the application logs themselves after they have been reconstructed here.
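For example, a hypothetical second stage that reassembles multiline application logs (here, Java stack traces) after the runtime lines have been reconstructed by the tail input:
[FILTER]
    name                  multiline
    match                 kube.*
    multiline.key_content log
    multiline.parser      java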
To learn more about Fluent Bit, we recommend joining the Fluent Community Slack channel where you will find thousands of other Fluent Bit users. Engage with experts, ask questions, and share best practices. Many of the troubleshooting tips in this blog were originally surfaced in the Slack channel.
Join the Fluent Community Slack channel
We also invite you to download a free copy of Fluent Bit with Kubernetes by Phil Wilkins. This practical guide to monitoring cloud native and traditional environments with Fluent Bit covers the basics of collecting app logs, filtering, routing, enriching, and transforming logs, metrics, and traces.
Download Fluent Bit with Kubernetes
Fluent Bit is a widely-used open-source data collection agent, processor, and forwarder that enables you to collect logs, metrics, and traces from various sources, filter and transform them, and then forward them to multiple destinations.
In fact, if you are using Kubernetes on a public cloud provider odds are that you are already running Fluent Bit. Fluent Bit is deployed by default in major Kubernetes distributions, including Google Kubernetes Engine (GKE), AWS Elastic Kubernetes Service (EKS), and Azure Kubernetes Service (AKS) and is used to route data to the cloud providers’ backend (e.g. CloudWatch, Azure Monitor Logs, or Google Cloud Logging).
In this post, we will demonstrate how to collect logs from different Kubernetes components (such as kube-proxy, kubelet, etc.) and send them to the destination of your choice. We’ll send them to Elasticsearch, but Fluent Bit also supports more than 40 other destinations.
Note: This guide exclusively focuses on the logs of Kubernetes components. If you wish to collect logs from application containers running on Kubernetes, please refer to this guide.
Let’s begin by identifying the names of the Kubernetes components from which we need to collect the logs.
According to the official documentation of Kubernetes, Kubernetes clusters are generally composed of the following:
Control plane components:
- kube-apiserver
- etcd
- kube-scheduler
- kube-controller-manager
- cloud-controller-manager
Node components:
- kubelet
- kube-proxy
- container runtime
The above lists contain the typical components you find in a Kubernetes cluster, but they are not necessarily an accurate starting point. Various flavors of Kubernetes exist, such as Self-Hosted, Managed Services, Openshift, Cluster API, etc. As a result, the specific component list might differ depending on the cluster you are working with.
For example, since we are using a managed Kubernetes cluster (EKS) from AWS, we don’t have control over the control plane components. These are entirely managed by AWS, and the logs of these control plane nodes are available in CloudWatch directly.
However, we do have control over the worker nodes. So our component list is as shown below:
- kubelet
- container runtime
- kube-proxy
- CNI plugin
Suppose you were using a self-hosted Kubernetes cluster on-premises. In that case, your list would include all the components we mentioned earlier.
Moving forward, the new list we’ve outlined has another complexity: Kubernetes offers two options for running the control plane components. They can be executed either as a server in the host or a Kubernetes pod in a worker node (see Kubernetes docs).
For our EKS cluster, the kubelet and container-runtime run as daemon processes on the host machine, while the kube-proxy and cni-plugin run as Kubernetes pods.
Below is our final list for EKS components with some additional information attached to it.
# Below components run as Daemon Processes
1. Kubelet
Service Name: "kubelet.service"
2. Container Runtime (containerd)
Service Name: "containerd.service"
# Below components run as Containers
1. Kube Proxy
Namespace: "kube-proxy"
Resource: "daemonset"
Resource Name: "kube-proxy"
2. CNI Plugin (VPC CNI)
Namespace: "kube-proxy"
Resource: "daemonset"
Resource Name: "aws-node"
To summarize, here’s a three-step process for selecting the components from which to gather logs:
1. Identify the components that make up your specific Kubernetes cluster (this varies by flavor).
2. Remove the components you don’t have access to (e.g., the control plane of a managed cluster).
3. For each remaining component, determine whether it runs as a daemon process on the host or as a Kubernetes pod.
With the components list ready, it’s time to configure Fluent Bit.
From our components list, we can see that we have two different types of data sources:
- components running as containers, which write their logs to files on the node
- components running as daemon processes, which write their logs to the systemd journal
Fluent Bit offers an input plugin for each of these data sources.
Containers store their logs in plain text files, which can be read by standard programs (like cat, tail, etc.). The Tail plugin operates similarly to the Linux tail command, where you specify the file path as an argument to read a specific file. In this context, the plugin takes the Path as a parameter to read files on the host machine. Since we’re using containerd as our container runtime, pod logs are stored in a nested directory structure at /var/log/pods.
/var/log/pods/ # Root directory for pod logs
|
|-- <namespace>_<pod-name>_<pod-uuid>/ # Directory for a specific pod
| |
| |-- <container-name>/ # Directory for a specific container within the pod
| |
| |-- 0.log # Log file for the container's first attempt (can increment for restarts)
However, we will utilize the /var/log/containers directory, which contains symbolic links to all files in the /var/log/pods directory. This directory is preferred as it stores files in a flat structure, with no nested directories.
To select only the aws-node and kube-proxy log files from the many others in the /var/log/containers directory, we’ll leverage Linux glob pattern matching. Observing the file names, we can create a pattern that selects specific files using *<namespace-name>_<pod-name>*. Our final paths for the log files will look like: /var/log/containers/*kube-system_kube-proxy* and /var/log/containers/*kube-system_aws-node*
[INPUT]
Name tail
Tag kubernetes.core.containers*
Path /var/log/containers/*kube-system_kube-proxy*,/var/log/containers/*kube-system_aws-node*
multiline.parser docker, cri
Read_from_Head true
For more information on the Tail Plugin of Fluent Bit, follow the official documentation.
On Linux machines, daemon processes are controlled using the systemctl CLI. These processes store logs in a binary format in the /var/log/journal directory. Since the Tail plugin cannot read these binary files directly, we use the systemd plugin, which handles format conversion and displays logs in a human-readable format. This plugin provides the Systemd_Filter parameter to specify the specific service name from which to read logs.
Our Fluent Bit configuration for the systemd plugin aligns with our component list as shown below:
[INPUT]
Name systemd
Tag kubernetes.*
Systemd_Filter _SYSTEMD_UNIT=kubelet.service
Systemd_Filter _SYSTEMD_UNIT=containerd.service
Note: If you specify a service that does not exist, Fluent Bit will implicitly ignore it.
For more information on the Systemd Plugin of Fluent Bit, follow the official documentation.
Combining both plugins, our final Fluent Bit input configuration will look like this:
[INPUT]
Name tail
Tag kubernetes.core.containers*
Path /var/log/containers/*kube-system_kube-proxy*,/var/log/containers/*kube-system_aws-node*
multiline.parser docker, cri
Read_from_Head true
[INPUT]
Name systemd
Tag kubernetes.*
Systemd_Filter _SYSTEMD_UNIT=kubelet.service
Systemd_Filter _SYSTEMD_UNIT=containerd.service
Note: The above configuration is derived from our components list, which might be different for a different Kubernetes cluster. This results in different configuration values, but they will be along the same lines.
With our configuration ready, let’s move forward and begin implementing it in Kubernetes.
We will deploy Fluent Bit using the Helm chart available at Fluent Bit Helm Chart.
1) Add Fluent Bit Helm Repo
Use the command below to add the Fluent Bit Helm repository:
helm repo add fluent https://fluent.github.io/helm-charts
2) Configure Fluent Bit
The default Helm chart configuration of Fluent Bit reads container logs and sends them to an Elasticsearch cluster. Before sending logs to Elasticsearch, we would like to test the configuration, so we have added a stdout output plugin to view logs in stdout itself for verification.
3) Override the default configuration
Create a file called values.yaml with the following contents:
config:
inputs: |
[INPUT]
Name systemd
Tag kubernetes.*
Systemd_Filter _SYSTEMD_UNIT=kubelet.service
Systemd_Filter _SYSTEMD_UNIT=containerd.service
[INPUT]
Name tail
Tag kubernetes.containers*
Path /var/log/containers/*kube-system_kube-proxy*,/var/log/containers/*kube-system_aws-node*
multiline.parser docker, cri
Read_from_Head true
outputs: |
[OUTPUT]
Name stdout
Match *
4) Deploy Fluent Bit
Use the command below:
helm upgrade -i fluent-bit fluent/fluent-bit --values values.yaml
5) Wait for Fluent Bit pods to run
Ensure that the Fluent Bit pods reach the Running state.
kubectl get pods
6) Verify Fluent Bit is working
Use the command below to verify that Fluent Bit is reading the logs of the Kubernetes components that we configured:
kubectl logs <fluent-bit-pod-name> -f
Search the output for logs from the components on our list, such as containerd and kube-proxy.
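To narrow the stream down, you can pipe the log output through grep (an optional quick check; substitute your actual pod name):

kubectl logs <fluent-bit-pod-name> -f | grep -E "containerd|kube-proxy"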
If you are unable to view the logs of any of the expected Kubernetes components, check the following:
- Verify that the Fluent Bit pods can read the host paths /var/log/ and /etc/machine-id (the systemd plugin needs /etc/machine-id to read the journal; the default Helm chart values mount both).
- Verify that the services are actually running on the node: systemctl | grep "kubelet.service\|containerd.service" (or alternatively systemctl status <service-name>).

With a working configuration, we can now send this data to any of the available Fluent Bit output plugins. Since we decided to send data to Elasticsearch, modify the output configuration in values.yaml by adding the es output plugin, and then apply the configuration with helm upgrade.
For more information on the es plugin, check the official documentation, or see our previous post for a step-by-step tutorial on sending logs to Elasticsearch.
config:
  inputs: |
    [INPUT]
        Name systemd
        Tag kubernetes.*
        Systemd_Filter _SYSTEMD_UNIT=kubelet.service
        Systemd_Filter _SYSTEMD_UNIT=containerd.service

    [INPUT]
        Name tail
        Tag kubernetes.containers*
        Path /var/log/containers/*kube-system_kube-proxy*,/var/log/containers/*kube-system_aws-node*
        multiline.parser docker, cri
        Read_from_Head true

  outputs: |
    [OUTPUT]
        Name es
        Match *
        Host <your-host-name-of-elastic-search>
        Logstash_Format On
        Retry_Limit False
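Apply the updated values with the same Helm command we used earlier:

helm upgrade -i fluent-bit fluent/fluent-bit --values values.yaml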
And you should be able to see logs in Elasticsearch.
This guide provided an overview of configuring and deploying Fluent Bit in a Kubernetes environment to manage logs from containers and daemon processes. By utilizing the Helm chart along with specific input and output plugins, administrators can streamline log management and forwarding, ensuring vital data is accessible and analyzable.
If you would like to learn more about using Fluent Bit with containerized cloud native environments, we invite you to download a free copy of Fluent Bit with Kubernetes by Phil Wilkins. In this 300+ page book, you’ll learn how to establish and optimize observability systems for Kubernetes, and more. From fundamental configuration to advanced integrations, this book lays out Fluent Bit’s full capabilities for log, metric, and trace routing and processing.
Download Fluent Bit with Kubernetes
Originally created at Treasure Data more than a decade ago, Fluentd is a widely adopted open source data collection project. Both Fluentd and Fluent Bit, a related project specifically created for containerized environments, utilize Forward Protocol for transporting log messages. In this post, we’ll explain Forward Protocol and its uses and provide an example of using Fluentd and Fluent Bit together to collect and route logs to a MongoDB database for storage.
Forward Protocol is a network protocol used by Fluentd to transport log messages between nodes. It is a binary protocol that is designed to be efficient and reliable. It uses TCP to transport messages and UDP for the heartbeat to check the status of servers.
It is a lightweight and efficient protocol that allows for the transmission of logs across different nodes or systems in real time. The protocol also supports buffering and retransmission of messages in case of network failures, ensuring that log data is not lost.
Fluentd Forward Client and Server are two components of the Fluentd logging system that work together to send and receive log data between different sources and destinations. The protocol communicates using MessagePack arrays and offers authentication and authorization options to ensure only authorized entities can send and receive logs. Read more about the forward protocol.
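To make the wire format concrete, here is the basic shape of a single event in the protocol's Message mode: a MessagePack array of tag, timestamp, and record (rendered as JSON for readability; the timestamp and field values below are illustrative):

["fluent_bit", 1700000000, {"cpu_p": 0.25, "user_p": 0.15}]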
Forward Protocol offers the following benefits:
- Efficiency: the compact binary (MessagePack) encoding keeps transport overhead low.
- Reliability: buffering and retransmission protect log data against network failures.
- Real-time delivery: logs stream between nodes as they are produced.
- Security options: authentication and authorization ensure only authorized entities send and receive logs.
Apart from Fluentd and Fluent Bit, there’s also a Docker log driver that uses Forward Protocol to send container logs to the Fluentd collector.
The OpenTelemetry Collector also supports receiving logs using Forward Protocol.
Both Fluentd and Fluent Bit are popular logging solutions in the cloud-native ecosystem. They are designed to handle high volumes of logs and provide reliable log collection and forwarding capabilities. Fluent Bit is lightweight and more suitable for edge computing and IoT use cases.
In this section, we’ll take a closer look at the differences between the two tools and understand a use case when you’d want to use both of them together.
| | Fluentd | Fluent Bit |
| --- | --- | --- |
| Scope | Containers / Servers | Embedded Linux / Containers / Servers |
| Language | C & Ruby | C |
| Memory | > 60 MB | ~1 MB |
| Performance | Medium performance | High performance |
| Dependencies | Built as a Ruby gem, it requires a certain number of gems. | Zero dependencies, unless some special plugin requires them. |
| Plugins | More than 1,000 external plugins available | More than 100 built-in plugins available |
| License | Apache License v2.0 | Apache License v2.0 |
Both Fluent Bit and Fluentd can be used as forwarders or aggregators, together or as standalone solutions. One use case for deploying them together is to use Fluent Bit to collect logs from containerized applications running in a Kubernetes cluster. Because Fluent Bit has a very small footprint, it can be deployed on every node. Meanwhile, Fluentd can be used to collect logs from various sources outside of Kubernetes, such as servers, databases, and network devices.
Ultimately, the choice between Fluentd and Fluent Bit depends on the specific needs and requirements of the use case at hand.
In the next section, we’ll explore how we can use Forward Protocol to push data from Fluent Bit to Fluentd.
Read more about Fluent Bit and Fluentd use cases
To understand how Forward Protocol works, we’ll set up instances of both Fluent Bit and Fluentd. We’ll collect CPU logs using Fluent Bit, and, using Forward Protocol, we’ll send them to Fluentd. From there, we will push the logs to MongoDB Atlas.
MongoDB Atlas is a cloud-based database service that allows users to easily deploy, manage, and scale MongoDB databases. It offers features such as automatic backups, monitoring, and security, making it a convenient and reliable option for managing data in the cloud. Hence, we’ll be pushing our logs to MongoDB from Fluentd.
In order to do that, we need to do the following:
- Configure Fluentd to receive logs over Forward Protocol.
- Configure Fluent Bit to collect logs and forward them to Fluentd.
- Configure Fluentd to push the received logs to MongoDB Atlas.

Apart from this, you might also have to do the following:
- Install Fluentd and Fluent Bit, if you haven't already.
- Create a MongoDB Atlas cluster, a database user, and a connection string.
- Install the Fluentd MongoDB output plugin (fluent-plugin-mongo).
The first step is to configure Fluentd to receive input from a forward source. After you install Fluentd, you need to update the configuration file with the following:
<source>
type forward
bind 0.0.0.0
port 24224
</source>
<match fluent_bit>
type stdout
</match>
<match fluent_bit>
@type mongo
database fluentd
collection fluentdforward
connection_string "mongodb+srv://fluentduser:[email protected]/test?retryWrites=true&w=majority"
</match>
In the above configuration, we define the source type as forward and provide a bind address and port. We also provide a match directive for the `fluent_bit` tag, so any log carrying this tag will be consumed. Because Fluentd routes each event to only the first matching <match> block, we use the copy output to duplicate every event to stdout (for verification) and to MongoDB Atlas, for which we've provided the database, collection, and connection_string.
After this, all you need to do is start the Fluentd service if it is not already running. It will not show any output at the moment since we have not yet configured Fluent Bit to forward the logs.
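How you start Fluentd depends on how it was installed; on a systemd-based Linux host with the official packages, a command along these lines should work (the service name is assumed from a standard fluent-package installation):

sudo systemctl start fluentd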
On the Fluent Bit side, we need to configure the INPUT and OUTPUT plugins.
INPUT
[INPUT]
    Name cpu
    Tag fluent_bit

[INPUT]
    Name kmsg
    Tag fluent_bit

[INPUT]
    Name systemd
    Tag fluent_bit
With this, we are collecting CPU metrics, kernel messages, and systemd journal logs, and tagging them all with `fluent_bit`.
OUTPUT
[OUTPUT]
    Name forward
    Match *
    Host 127.0.0.1
    Port 24224
For output, we’re using a forward output plugin that routes the logs to the specified host and port.
Save the configuration and restart the Fluent Bit service. If everything is correct, you'll see the logs being streamed by Fluentd. Navigate to your MongoDB Atlas UI and refresh the collection; you should be able to see the logs as shown below.
This way, we are able to make use of the forward plugin to share logs between Fluent Bit and Fluentd. You can use Forward Protocol with other products that support it to gather logs from different sources and push them to different tools.
To learn more about Fluent Bit and its powerful data processing and routing capabilities, check out Fluent Bit Academy, your destination for best practices and training on all things Fluent Bit. It's filled with on-demand videos guiding you through advanced processing rules, routing to multiple destinations, and much more.
With Chronosphere’s acquisition of Calyptia in 2024, Chronosphere became the primary corporate sponsor of Fluent Bit. Eduardo Silva — the original creator of Fluent Bit and co-founder of Calyptia — leads a team of Chronosphere engineers dedicated full-time to the project, ensuring its continuous development and improvement.
Fluent Bit is a graduated project of the Cloud Native Computing Foundation (CNCF) under the umbrella of Fluentd, alongside other foundational technologies such as Kubernetes and Prometheus. Chronosphere is also a silver-level sponsor of the CNCF.
Logs are the foundational data of any observability effort. They provide information about every event and error in your applications, making them essential for troubleshooting. Elasticsearch allows us to store, search, and analyze huge volumes of data quickly, making it ideal for the massive volumes of log and other telemetry data generated by modern applications. It is also one of the components of the ELK Stack (Elasticsearch, Logstash, and Kibana), a widely-used log management solution.
Fluent Bit is the leading open source solution for collecting, processing, and routing large volumes of telemetry data, including logs, traces, and metrics. When used as the agent for sending logs to Elasticsearch, you have a highly performative telemetry pipeline.
In this post, we will show you how to send logs to Elasticsearch using Fluent Bit.
This tutorial assumes that you already have Fluent Bit installed and running on your source and that you have an Elasticsearch deployment available.
For this tutorial, we will run Fluent Bit on an AWS EC2 instance running Amazon Linux 2 and send the logs to Elastic Cloud, Elastic's hosted service. The configurations you use will vary slightly depending on your source and on whether you are using Elastic Cloud or another version of Elasticsearch.
Fluent Bit accepts data from a variety of sources using input plugins. The Tail input plugin allows you to read from a text log file as though you were running the tail -f command.
Add the following to your fluent-bit.conf file:
[INPUT]
    Name tail
    Path /var/log/*.log
    Tag ec2_logs
Depending upon your source, you may need to adjust the Path parameter to point to your logs. Name identifies which plugin Fluent Bit should load and is not customizable by the user. Tag is optional but can be used for routing and filtering your data (more on that below).
As with inputs, Fluent Bit uses output plugins to send the gathered data to their desired destinations.
To set up your configuration, you will need to gather some information from your Elasticsearch deployment:
- The host name of your Elasticsearch endpoint and the port it listens on
- Your Cloud ID (for Elastic Cloud deployments)
- Authentication credentials in the form user:password
Once you have gathered the required information, add the following to your fluent-bit.conf file below the Input section:
[OUTPUT]
    Name es
    Match *
    Host https://sample.es.us-central1.gcp.cloud.es.io
    Cloud_auth elastic:yRSUzmsEep2DoGIyNT7bFEr4
    Cloud_id sample:dXMtY2VudHJhbDEuZ2NwLmNsb3VkLmVzLmlvOjQ0MyQ2MDA4NjljMjA4M2M0ZWM2YWY2MDQ5OWE5Y2Y3Y2I0NCQxZTAyMzcxYzAwODg0NDJjYWI0NzIzNDA2YzYzM2ZkYw==
    Port 9243
    tls On
    tls.verify Off

[OUTPUT]
    # optional: send the data to standard output for debugging
    Name stdout
    Match *
Be sure to update the values to match your own Elastic account details.
The host is your Elasticsearch endpoint. Cloud_Auth corresponds to your authentication credentials and must be presented as user:password.
The Match * parameter indicates that all of the data gathered by Fluent Bit will be forwarded to Elasticsearch. We could also match based on a tag defined in the input plugin. tls On ensures that the connection between Fluent Bit and the Elasticsearch cluster is secure. By default, the Port is configured to 9200, so we need to change it to 9243, the port used by Elastic Cloud.
We have also defined a secondary output that sends all the data to stdout. This is not required for the Elasticsearch configuration but can be incredibly helpful if we need to debug our configuration.
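For example, to route only the records produced by the Tail input defined earlier, we could match on its tag instead of the wildcard (a minimal fragment; the remaining es parameters stay the same as above):

[OUTPUT]
    Name es
    Match ec2_logs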
Once you have saved the changes to your fluent-bit.conf file, you'll need to restart Fluent Bit to allow the new configuration to take effect:
sudo systemctl restart fluent-bit
Note: If Fluent Bit is configured to utilize its optional Hot Reload feature, you do not have to restart the service.
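If you'd like to try that option, recent Fluent Bit versions let you enable hot reloading in the SERVICE section and then trigger it with a SIGHUP signal. The snippet below is a sketch based on the feature as documented; verify the key name and behavior against your installed version:

[SERVICE]
    # enable runtime configuration reload (newer Fluent Bit versions)
    Hot_Reload On

Then trigger the reload without restarting the service:

kill -HUP $(pidof fluent-bit)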
Check to make sure Fluent Bit restarted correctly.
systemctl status fluent-bit
Again, these commands may differ depending on your system.
Your logs should now be flowing into Elasticsearch, and you should be able to search your data.
We’ve just seen a basic configuration for getting log data from an AWS EC2 instance into Elasticsearch in Elastic Cloud. The Fluent Bit Elasticsearch output plugin supports many additional parameters that enable you to fine-tune your Fluent Bit to Elasticsearch pipeline, including options for using Amazon Open Search. Check out the Fluent Bit documentation for more.
Fluent Bit also allows you to process the data before routing it to its final destination. You can, for example:
- Parse raw log lines into structured JSON records
- Enrich records with additional metadata
- Redact sensitive fields or drop noisy ones entirely
- Route different subsets of your data to different destinations
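As a small illustration of in-flight processing, here is a minimal sketch that uses the built-in grep filter to drop DEBUG-level lines before they are routed (the ec2_logs tag comes from the Tail input defined earlier, and we assume the raw line is stored in the default log key):

[FILTER]
    # exclude any record whose "log" field starts with DEBUG
    Name grep
    Match ec2_logs
    Exclude log ^DEBUG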
Routing is particularly powerful as it allows you to redirect non-essential data to cheaper storage (or even drop it entirely), potentially saving you thousands of dollars when using costly storage and analysis applications priced by consumption.
You may be asking yourself why you should use Fluent Bit rather than Elastic Agent. It’s a fair question.
Fluent Bit is vendor-neutral. Fluent Bit doesn’t care what backend you are using. It can send data to all of the major backends, including Elasticsearch, Chronosphere Observability Platform, Splunk, Datadog, and more. This helps you to avoid costly vendor lock-in. Transitioning to a new backend is a simple configuration change—no new vendor-specific agent to install across your entire infrastructure.
Fluent Bit is lightweight. It was created to be a lightweight, highly performant alternative to Fluentd designed for containerized and IoT deployments. Although its footprint is only ~450KB, it certainly punches above its weight class when it comes to processing millions of records daily.
Fluent Bit is open source. Fluent Bit is a graduated Cloud Native Computing Foundation project under the Fluentd umbrella.
Fluent Bit is trusted. Fluent Bit has been downloaded and deployed billions of times. In fact, it is included with major Kubernetes distributions, including Google Kubernetes Engine (GKE), AWS Elastic Kubernetes Service (EKS), and Azure Kubernetes Service (AKS).
As we have seen, Fluent Bit is a powerful component of your telemetry pipeline and is relatively simple to configure manually. However, such manual configuration becomes untenable as your infrastructure scales to dozens, hundreds, or even thousands of sources.
Chronosphere Telemetry Pipeline, from the creators of Fluent Bit and Calyptia, streamlines log collection, aggregation, transformation, and routing from any source to any destination. Telemetry Pipeline also simplifies fleet operations by automating and centralizing the installation, configuration, and maintenance of Fluent Bit agents across thousands of machines.
This gives companies dealing with high costs and complexity the ability to control their data and scale to meet their growing business needs.
Read more about Telemetry Pipeline