Administrators who use Internet Information Services (IIS) to host websites know that IIS logs can be difficult to search and analyze, especially when they are under pressure to identify the cause of an outage or performance issue. Event Viewer, the default Windows application for searching and analyzing logs, is unintuitive, so many users prefer other tools, which typically require converting the logs into JSON. Although IIS logs are flat files in which each line contains data about an individual web hit, much like Apache or Nginx logs, they are not as easy to format into JSON.
In this post, we’ll demonstrate a better approach to IIS logging. We’ll show you how to configure IIS to enrich the logs with non-standard metadata. We’ll then collect the logs with Fluent Bit where we will use a custom Wasm plugin to transform and enrich the data. Finally, we’ll have Fluent Bit route our data (now formatted as JSON) to ClickHouse for storage where we can then extract it using Grafana for visualization and analysis.
Fluent Bit is a fast, lightweight, and highly scalable log, metric, and trace processor and forwarder that has been deployed billions of times. It is a Cloud Native Computing Foundation graduated open-source project with an Apache 2.0 license.
Fluent Bit uses a pluggable architecture, enabling new data sources and destinations, processing filters, and other new features to be added with approved plugins. Although there are dozens of supported plugins, there may be times when no out-of-the-box plugin accomplishes the exact task you need.
Thankfully, Fluent Bit lets developers write custom scripts using Lua or WebAssembly for such instances.
WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine. Wasm is designed as a portable compilation target for programming languages, enabling deployment on the web for client and server applications. Developer reference documentation for Wasm can be found on MDN’s WebAssembly pages.
This post covers how Wasm can be used with Fluent Bit to implement custom logic and functionalities.
To achieve the desired outcome, several tasks need to be addressed. First, data validation should be implemented to ensure the accuracy and integrity of the processed information. We should also perform type conversion to ensure compatibility and consistency across data formats.
Integrating external resources such as APIs or databases can further enrich the logs with additional relevant information. Finally, applying backward compatibility, maintainability, and testability principles to the source code ensures its longevity and ease of future modification.
Specifically, we’ll demonstrate how to collect and parse Internet Information Services (IIS) w3c logs (with some custom modifications) and transform the raw string into a structured JSON record.
Organizations need to collect and parse logs generated by IIS (Internet Information Services). In this particular use case, we will explore the significance of utilizing the Fluent Bit WebAssembly (Wasm) plugin to create custom modifications for logs collected in the w3c format.
By leveraging the Fluent Bit Wasm plugin, organizations can enhance their log processing capabilities by implementing tailored transformations and enrichments specific to their requirements. This ability empowers them to extract valuable insights and gain a deeper understanding of their IIS logs, enabling more effective troubleshooting, monitoring, and analysis of their web server infrastructure.
The following diagram provides an overview of the actions we will take:
This diagram highlights an interesting aspect, namely the introduction of WebAssembly in Fluent Bit. In previous versions of Fluent Bit, the workflow for this use case was relatively straightforward. Log information was extracted using parsers that relied on regular expressions or Lua code.
However, with the introduction of the Wasm plugin, Fluent Bit now offers a more versatile and powerful approach to log extraction and processing. Wasm enables the implementation of custom modifications and transformations, allowing for greater flexibility and efficiency in handling log data. This advancement in Fluent Bit’s capabilities opens up new possibilities for extracting and manipulating log information, ultimately enhancing the overall log processing workflow.
Currently, Fluent Bit offers an ecosystem of plugins, filters, and robust parsers for building pipelines and routing different workflows.
Parsers can be created with regular expressions, and components can be written in programming languages such as C, Golang, and Rust and compiled to Wasm.
Our use case shows how to use Rust to develop a Wasm plugin.
Note: One of the reasons for using Rust as a programming language is that I previously developed a PoC project to learn Rust. The idea was to create a Fluent Bit-inspired log collector for IIS files, parse the logs, and send them to various destinations (Kafka, Loki, Postgres). Having that code base turned out to be interesting for combining existing logic with the proposal offered by Fluent Bit to integrate Rust with Wasm into its ecosystem.
Configure the IIS log output standard:
By default, IIS w3c logs include fields that may not always provide relevant information for defining usage metrics and access patterns. Additionally, these logs may not cover custom fields specific to our use case.
One example is the c-authorization-header field, which is essential for our analysis but not included in the default log format. Therefore, it becomes necessary to customize the log configuration to include this field and any other relevant custom fields crucial to our specific requirements.
This customization ensures we can access all the necessary information to accurately define metrics and gain insights into our IIS server’s usage and access patterns.
date time s-sitename s-computername s-ip cs-method cs-uri-stem cs-uri-query s-port c-ip cs(User-Agent) cs(Cookie) cs(Referer) cs-host sc-status sc-bytes cs-bytes time-taken c-authorization-header.
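For reference, this is roughly how a custom field can be added to the IIS W3C log definition in applicationHost.config (a hedged sketch; the exact attributes depend on your IIS version and on which request header you want to capture). Custom fields can also be added through the Logging feature in IIS Manager.

<logFile logFormat="W3C">
  <customFields>
    <!-- Hypothetical example: log the Authorization request header as c-authorization-header -->
    <add logFieldName="c-authorization-header" sourceName="Authorization" sourceType="RequestHeader" />
  </customFields>
</logFile>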
To get started, we need to create a new project to construct the filter. Following the official documentation, run this command in our terminal:
cargo new flb_filter_plugin --lib
The command cargo new flb_filter_plugin --lib creates a new Rust project. The --lib flag specifies that the project should be created as a library, which is suitable for developing Fluent Bit filter plugins.
Next, open the Cargo.toml file and add the following section:
[lib]
crate-type = ["cdylib"]
[dependencies]
serde = { version = "1.0.160", features = ["derive"] }
serde_json = "1.0.104"
serde_bytes = "0.11"
rmp-serde = "1.1"
regex = "1.9.2"
chrono = "0.4.24"
libc = "0.2"
Next, open up src/lib.rs and overwrite it with the following entry point code. We will explain the code in the following section.
use std::io::Write;
use std::os::raw::c_char;
use std::slice;
use std::str;

use chrono::{TimeZone, Utc};
use serde_json::{json, Value};

// LogEntryIIS and its W3C parser are defined elsewhere in the crate (see the repository linked below).

#[no_mangle]
pub extern "C" fn flb_filter_log_iis_w3c_custom(
tag: *const c_char,
tag_len: u32,
time_sec: u32,
time_nsec: u32,
record: *const c_char,
record_len: u32,
) -> *const u8 {
let slice_tag: &[u8] = unsafe { slice::from_raw_parts(tag as *const u8, tag_len as usize) };
let slice_record: &[u8] =
unsafe { slice::from_raw_parts(record as *const u8, record_len as usize) };
let mut vt: Vec<u8> = Vec::new();
vt.write(slice_tag).expect("Unable to write");
let vtag = str::from_utf8(&vt).unwrap();
let v: Value = serde_json::from_slice(slice_record).unwrap();
let dt = Utc.timestamp_opt(time_sec as i64, time_nsec).unwrap();
let time = dt.format("%Y-%m-%dT%H:%M:%S.%9f %z").to_string();
let input_logs = v["log"].as_str().unwrap();
let mut buf=String::new();
if let Some(el) = LogEntryIIS::parse_log_iis_w3c_parser(input_logs) {
let log_parsered = json!({
"date": el.date_time,
"s_sitename": el.s_sitename,
"s_computername": el.s_computername,
"s_ip": el.s_ip,
"cs_method": el.cs_method,
"cs_uri_stem": el.cs_uri_stem,
"cs_uri_query": el.cs_uri_query,
"s_port": el.s_port,
"c_ip": el.c_ip,
"cs_user_agent": el.cs_user_agent,
"cs_cookie": el.cs_cookie,
"cs_referer": el.cs_referer,
"cs_host": el.cs_host,
"sc_status": el.sc_status,
"sc_bytes": el.sc_bytes.parse::<i32>().unwrap(),
"cs_bytes": el.cs_bytes.parse::<i32>().unwrap(),
"time_taken": el.time_taken.parse::<i32>().unwrap(),
"c_authorization_header": el.c_authorization_header,
"tag": vtag,
"source": "LogEntryIIS",
"timestamp": format!("{}", time)
});
let message = json!({
"log": log_parsered,
"s_sitename": el.s_sitename,
"s_computername": el.s_computername,
"cs_host": el.cs_host,
"date": el.date_time,
});
buf= message.to_string();
}
buf.as_ptr()
}
This Rust code defines a function called flb_filter_log_iis_w3c_custom, which is intended to be used as a filter plugin in Fluent Bit with the WebAssembly module.
let slice_tag: &[u8] = unsafe { slice::from_raw_parts(tag as *const u8, tag_len as usize) };
let slice_record: &[u8] =
unsafe { slice::from_raw_parts(record as *const u8, record_len as usize) };
let mut vt: Vec<u8> = Vec::new();
vt.write(slice_tag).expect("Unable to write");
let vtag = str::from_utf8(&vt).unwrap();
let v: Value = serde_json::from_slice(slice_record).unwrap();
let dt = Utc.timestamp_opt(time_sec as i64, time_nsec).unwrap();
let time = dt.format("%Y-%m-%dT%H:%M:%S.%9f %z").to_string();
The function takes several parameters: tag, tag_len, time_sec, time_nsec, record, and record_len. These parameters represent the tag, timestamp, and log record information passed from Fluent Bit.

The code then converts the received parameters into Rust slices (&[u8]) to work with the data. It creates a mutable vector (Vec<u8>) called vt and writes the tag data into it. The vtag variable is created by converting the vt vector into a UTF-8 string.

Next, the code deserializes the record data into a serde_json::Value object called v.
The incoming structured logs are:
{"log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 GET / - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 304 142 756 1078 -"}
It also converts the time_sec and time_nsec values into a DateTime object using the Utc.timestamp_opt function.

The code then extracts specific fields from the v object and assigns them to variables. These fields represent various properties of an IIS log entry, such as the date, site name, computer name, IP address, HTTP method, URI, status codes, and more.
let input_logs = v["log"].as_str().unwrap();
let mut buf=String::new();
if let Some(el) = LogEntryIIS::parse_log_iis_w3c_parser(input_logs) {
let log_parsered = json!({
"date": el.date_time,
"s_sitename": el.s_sitename,
"s_computername": el.s_computername,
"s_ip": el.s_ip,
"cs_method": el.cs_method,
"cs_uri_stem": el.cs_uri_stem,
"cs_uri_query": el.cs_uri_query,
"s_port": el.s_port,
"c_ip": el.c_ip,
"cs_user_agent": el.cs_user_agent,
"cs_cookie": el.cs_cookie,
"cs_referer": el.cs_referer,
"cs_host": el.cs_host,
"sc_status": el.sc_status,
"sc_bytes": el.sc_bytes.parse::<i32>().unwrap(),
"cs_bytes": el.cs_bytes.parse::<i32>().unwrap(),
"time_taken": el.time_taken.parse::<i32>().unwrap(),
"c_authorization_header": el.c_authorization_header,
"tag": vtag,
"source": "LogEntryIIS",
"timestamp": format!("{}", time)
});
let message = json!({
"log": log_parsered,
"s_sitename": el.s_sitename,
"s_computername": el.s_computername,
"cs_host": el.cs_host,
"date": el.date_time,
});
buf= message.to_string();
}
buf.as_ptr()
If the log entry can be successfully parsed using the LogEntryIIS::parse_log_iis_w3c_parser function, the code constructs a new JSON object representing the parsed log entry. It includes additional fields like the tag, source, and timestamp. The log entry and some specific fields are also included in a separate JSON object called message.
Finally, the code converts the message object to a string and assigns it to the buf variable. The function returns a pointer to the buf string, which will be used by Fluent Bit.
In summary, this code defines a custom filter plugin for Fluent Bit that processes IIS w3c log records, extracts specific fields, and constructs new JSON objects representing the parsed log entries.
The rest of the code is hosted at https://github.com/kenriortega/flb_filter_iis.git. It is an open-source project and currently provides two functions focused on the current need: parsing and processing a specific format. However, it is subject to new proposals and ideas to grow the project as a suite of possible use cases.
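For orientation, here is a deliberately reduced, hypothetical sketch of what the LogEntryIIS struct and its parse_log_iis_w3c_parser function might look like; the actual implementation in the repository covers all 19 columns of our custom W3C format and additional edge cases:

// Reduced sketch only; the real struct has one field per W3C column.
#[derive(Debug)]
pub struct LogEntryIIS {
    pub date_time: String,
    pub s_sitename: String,
    pub cs_method: String,
    pub sc_status: String,
    pub time_taken: String,
}

impl LogEntryIIS {
    pub fn parse_log_iis_w3c_parser(line: &str) -> Option<LogEntryIIS> {
        // W3C fields are space-separated; IIS replaces spaces inside values
        // (e.g. the user agent) with '+', so a plain split is enough.
        let f: Vec<&str> = line.split_whitespace().collect();
        if f.len() < 19 {
            // Not a complete record for our customized field list.
            return None;
        }
        Some(LogEntryIIS {
            date_time: format!("{} {}", f[0], f[1]), // date + time columns
            s_sitename: f[2].to_string(),
            cs_method: f[5].to_string(),
            sc_status: f[14].to_string(),
            time_taken: f[17].to_string(),
        })
    }
}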
To compile the plugin, consult the official Fluent Bit documentation for instructions on performing this process in your local environment, including the requirements for installing the Rust toolchain and the Wasm target.
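Assuming the Rust toolchain is managed with rustup, the Wasm compilation target can be added with:

rustup target add wasm32-unknown-unknown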
$ cargo build --target wasm32-unknown-unknown --release
$ ls target/wasm32-unknown-unknown/release/*.wasm
target/wasm32-unknown-unknown/release/filter_rust.wasm
If you want to use the plugin from the repository directly, there is a release section where it is automatically compiled using GitHub Actions.
To reproduce the demo, a docker-compose.yaml file is included in the repository, providing the resources needed for the steps below.
version: '3.8'
volumes:
  clickhouse:
services:
  clickhouse:
    container_name: clickhouse
    image: bitnami/clickhouse:latest
    environment:
      - ALLOW_EMPTY_PASSWORD=no
      - CLICKHOUSE_ADMIN_PASSWORD=default
    ports:
      - 8123:8123
  fluent-bit:
    image: cr.fluentbit.io/fluent/fluent-bit
    container_name: fluent-bit
    ports:
      - 8888:8888
      - 2020:2020
    volumes:
      - ./docker/conf/fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
      - ./target/wasm32-unknown-unknown/release/flb_filter_iis_wasm.wasm:/plugins/flb_filter_iis_wasm.wasm
      - ./docker/dataset:/dataset/
  grafana:
    image: grafana/grafana:latest
    environment:
      - GF_PATHS_PROVISIONING=/etc/grafana/provisioning
      - GF_AUTH_ANONYMOUS_ENABLED=false
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
    depends_on:
      - clickhouse
    ports:
      - "3000:3000"
We next configure Fluent Bit to process the logs collected from IIS. To make this tutorial easier to reproduce, we will use the dummy input plugin to generate sample logs. We provide several inputs that simulate GET and POST requests returning status codes such as 200, 401, and 404.
[INPUT]
Name dummy
Dummy {"log": "2023-07-20 17:18:54 W3SVC279 WIN-PC1 192.168.1.104 GET /api/Site/site-data qName=quww 13334 10.0.0.0 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/114.0.0.0+Safari/537.36+Edg/114.0.1823.82 _ga=GA2.3.499592451.1685996504;+_gid=GA2.3.1209215542.1689808850;+_ga_PC23235C8Y=GS2.3.1689811012.8.0.1689811012.0.0.0 http://192.168.1.104:13334/swagger/index.html 192.168.1.104:13334 200 456 1082 3131 Bearer+token"}
Tag log.iis.*
[INPUT]
Name dummy
Dummy {"log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 GET / - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 404 142 756 1078 -"}
Tag log.iis.get
[INPUT]
Name dummy
Dummy {"log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 POST / - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 200 142 756 1078 -"}
Tag log.iis.post
[INPUT]
Name dummy
Dummy {"log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 POST/ - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 401 142 756 1078 -"}
Tag log.iis.post
[FILTER]
Name wasm
match log.iis.*
WASM_Path /plugins/flb_filter_iis_wasm.wasm
Function_Name flb_filter_log_iis_w3c_custom
accessible_paths .
This Fluent Bit filter configuration specifies the usage of a WebAssembly filter plugin to process log records that match the pattern log.iis.*.

- The param Name with value wasm specifies the name of the filter plugin, which in this case is "wasm".
- The param WASM_Path specifies the path to the WebAssembly module file that contains the filter plugin implementation.
- The param Function_Name specifies the name of the function within the WebAssembly module that will be used as the filter implementation.
The stdout output is used to check and visualize in the terminal the output result after filter processing.
[OUTPUT]
name stdout
match log.iis.*
The result is as follows:
2023-10-21 09:36:33 [0] log.iis.post: [[1697906192.407803136, {}], {"cs_host"=>"localhost", "date"=>"2023-08-11 19:56:44", "log"=>{"c_authorization_header"=>"-", "c_ip"=>"::1", "cs_bytes"=>756, "cs_cookie"=>"-", "cs_host"=>"localhost", "cs_method"=>"POST", "cs_referer"=>"-", "cs_uri_query"=>"-", "cs_uri_stem"=>"/", "cs_user_agent"=>"Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200", "date"=>"2023-08-11 19:56:44", "s_computername"=>"WIN-PC1", "s_ip"=>"::1", "s_port"=>"80", "s_sitename"=>"W3SVC1", "sc_bytes"=>142, "sc_status"=>"200", "source"=>"LogEntryIIS", "tag"=>"log.iis.post", "time_taken"=>1078, "timestamp"=>"2023-10-21T16:36:32.407803136 +0000"}, "s_computername"=>"WIN-PC1", "s_sitename"=>"W3SVC1"}]
The output is a log record that has been processed by Fluent Bit with the specified filter configuration. The transformation gives us all the advantages of our code implementation, including data validation and type conversion.
# Should be optional.
[OUTPUT]
name http
tls off
match *
host clickhouse
port 8123
URI /?query=INSERT+INTO+fluentbit.iis+FORMAT+JSONEachRow
format json_stream
json_date_key timestamp
json_date_format epoch
http_user default
http_passwd default
To ingest these logs inside ClickHouse, we need to use the http output module. The http output plugin of Fluent Bit allows flushing records into an HTTP endpoint. The plugin issues a POST request with the data records in MessagePack (or JSON). The plugin supports dynamic tags, which allow sending data with different tags through the same input.
Please refer to the official documentation for more information on Fluent Bit’s HTTP output module.
The ClickHouse database must have the following configuration, which was taken from the article Sending Kubernetes logs To ClickHouse with Fluent Bit.
With these steps in place, we can continue with our use case. Once the structured logs are parsed by our filter, we can run queries that let us analyze the behavior of our websites and APIs hosted on IIS.
First, we need to create the database using your client of choice.
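For example, with the docker-compose setup shown earlier, one way to open a SQL session is with the clickhouse-client bundled in the container (assuming the default admin user and the password defined in the environment):

docker exec -it clickhouse clickhouse-client --user default --password default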
CREATE DATABASE fluentbit
SET allow_experimental_object_type = 1;
CREATE TABLE fluentbit.iis
(
log JSON,
s_sitename String,
s_computername String,
cs_host String,
date Datetime
)
Engine = MergeTree ORDER BY tuple(date,s_sitename,s_computername,cs_host)
TTL date + INTERVAL 3 MONTH DELETE;
This query is written in ClickHouse syntax, and it creates a database named "fluentbit" and a table named "iis" within that database. Let's break down the query step by step:

- CREATE DATABASE fluentbit: This statement creates a new database named "fluentbit" if it doesn't already exist.
- SET allow_experimental_object_type = 1;: This command enables experimental object types in ClickHouse. It allows you to use certain experimental features that may not be entirely stable or supported.
- CREATE TABLE fluentbit.iis: This statement creates a new "iis" table within the "fluentbit" database. The table will contain the following columns:
  - log: This column has the data type JSON, which means it can store data in JSON format.
  - s_sitename: This column has the data type String and stores the site name.
  - s_computername: This column has the data type String and stores the computer name.
  - cs_host: This column has the data type String and stores the host.
  - date: This column has the data type Datetime and stores the date and time.
- Engine = MergeTree ORDER BY tuple(date, s_sitename, s_computername, cs_host): This specifies the storage engine for the "iis" table as MergeTree. The MergeTree engine is a popular ClickHouse storage engine that efficiently handles time-series data. The ORDER BY clause specifies the primary sorting order of the table, which is based on the columns date, s_sitename, s_computername, and cs_host.
- TTL date + INTERVAL 3 MONTH DELETE;: This sets a Time-to-Live (TTL) rule on the table. It means that ClickHouse will automatically delete rows from the table where the date column is older than three months. This helps to manage the data and keep the table size under control.

We can check that our workflow is working properly by querying the ingested data:
SET output_format_json_named_tuples_as_objects = 1;
SELECT log FROM fluentbit.iis
LIMIT 1000 FORMAT JSONEachRow;
Now that we have confirmed that ClickHouse is successfully receiving data from Fluent Bit, we can perform queries that provide us with information about the performance and behavior of our sites.
For example, to get the average of numeric fields such as time_taken, sc_bytes, and cs_bytes:
SELECT AVG(log.time_taken) FROM fluentbit.iis;
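A variant of the same idea (not from the original repository) that averages all three numeric fields in one pass might look like this:

SELECT
    AVG(log.time_taken) AS avg_time_taken,
    AVG(log.sc_bytes)   AS avg_sc_bytes,
    AVG(log.cs_bytes)   AS avg_cs_bytes
FROM fluentbit.iis;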
Another example is grouping by IP. This query is an aggregation on the “fluentbit.iis” table:
SET output_format_json_named_tuples_as_objects = 1;
SELECT COUNT(*),c_ip FROM fluentbit.iis
GROUP BY log.c_ip as c_ip;
- SELECT COUNT(*), c_ip: This part of the query specifies the columns to select in the result. It retrieves two values: the count of rows (COUNT(*)) and the value of the c_ip column.
- FROM fluentbit.iis: This indicates the table to select data from. In this case, it selects data from the "iis" table within the "fluentbit" database.
- GROUP BY log.c_ip as c_ip: This clause groups the rows based on the values of the log.c_ip column and assigns the alias c_ip to the result. log.c_ip refers to the c_ip field within the log JSON column.

SELECT count(*)
FROM fluentbit.iis
WHERE log.sc_status LIKE '4%';
This query calculates the count of rows that meet a specific condition in the "fluentbit.iis" table:

- SELECT count(*): This part of the query specifies that we want to calculate the count of rows that match the given condition.
- FROM fluentbit.iis: This part indicates the table we want to retrieve the data from. In this case, it is the "iis" table within the "fluentbit" database.
- WHERE log.sc_status LIKE '4%': This clause specifies the condition that must be satisfied for a row to be included in the count. The LIKE operator with the pattern '4%' matches any string that starts with 4. This condition matches statuses starting with 4, which typically represent client errors in HTTP responses (e.g., 400 Bad Request, 404 Not Found).

These and many other queries on the collected logs can be written according to our needs.
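For instance, a hypothetical query to list the ten most requested URI stems could look like this:

SELECT log.cs_uri_stem AS uri, COUNT(*) AS hits
FROM fluentbit.iis
GROUP BY log.cs_uri_stem
ORDER BY hits DESC
LIMIT 10;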
Now that our records are stored in a database, we can use a visualization tool like Grafana for analysis rather than relying solely on pure SQL.
ClickHouse makes this process easy by offering a plugin for Grafana. The Grafana plugin allows users to connect directly to ClickHouse, enabling them to create interactive dashboards and visually explore their data.
With Grafana’s intuitive interface and powerful visualization capabilities, users can gain valuable insights and make data-driven decisions more effectively. To learn more about connecting Grafana to ClickHouse, you can find detailed documentation and instructions on the official ClickHouse website: Connecting Grafana to ClickHouse.
The Fluent Bit Wasm filter approach gives us several powerful advantages inherent to programming languages, such as performing data validation and type conversion on fields like sc_bytes, cs_bytes, and time_taken. This is particularly useful when we need to validate our data results.

To learn more about Fluent Bit and its powerful data processing and routing capabilities, check out Fluent Bit Academy. It's filled with on-demand videos guiding you through all things Fluent Bit: best practices and how-to's on advanced processing rules, routing to multiple destinations, and much more.
With Chronosphere’s acquisition of Calyptia in 2024, Chronosphere became the primary corporate sponsor of Fluent Bit. Eduardo Silva — the original creator of Fluent Bit and co-founder of Calyptia — leads a team of Chronosphere engineers dedicated full-time to the project, ensuring its continuous development and improvement.
Fluent Bit is a graduated project of the Cloud Native Computing Foundation (CNCF) under the umbrella of Fluentd, alongside other foundational technologies such as Kubernetes and Prometheus. Chronosphere is also a silver-level sponsor of the CNCF.
Logs are the foundational data of any observability effort. They provide information about every event and error in your applications, making them essential for troubleshooting. Elasticsearch allows us to store, search, and analyze huge volumes of data quickly, making it ideal for the massive volumes of log and other telemetry data generated by modern applications. It is also one of the components of the ELK Stack (Elasticsearch, Logstash, and Kibana), a widely-used log management solution.
Fluent Bit is the leading open source solution for collecting, processing, and routing large volumes of telemetry data, including logs, traces, and metrics. When used as the agent for sending logs to Elasticsearch, you have a highly performative telemetry pipeline.
In this post, we will show you how to send logs to Elasticsearch using Fluent Bit.
This tutorial assumes that you already have Fluent Bit installed and running on your source and that you have an Elasticsearch deployment available.
For this tutorial, we will run Fluent Bit on an EC2 instance from AWS running Amazon Linux2 and send the logs to Elastic Cloud, Elastic's hosted service. The configurations you use will vary slightly depending on your source and whether you are using Elastic Cloud or another version of Elasticsearch.
Fluent Bit accepts data from a variety of sources using input plugins. The Tail input plugin allows you to read from a text log file as though you were running the tail -f command.

Add the following to your fluent-bit.conf file.
[INPUT]
Name tail
Path /var/log/*.log
Tag ec2_logs
Depending upon your source, you may need to adjust the Path parameter to point to your logs. Name identifies which plugin Fluent Bit should load and is not customizable by the user. Tag is optional but can be used for routing and filtering your data (more on that below).
As with inputs, Fluent Bit uses output plugins to send the gathered data to their desired destinations.
To set up your configuration, you will need to gather some information from your Elasticsearch deployment, such as its endpoint, Cloud ID, and authentication credentials.

Once you have gathered the required information, add the following to your fluent-bit.conf file below the Input section.
[OUTPUT]
Name es
Match *
Host https://sample.es.us-central1.gcp.cloud.es.io
Cloud_auth elastic:yRSUzmsEep2DoGIyNT7bFEr4
Cloud_id sample:dXMtY2VudHJhbDEuZ2NwLmNsb3VkLmVzLmlvOjQ0MyQ2MDA4NjljMjA4M2M0ZWM2YWY2MDQ5OWE5Y2Y3Y2I0NCQxZTAyMzcxYzAwODg0NDJjYWI0NzIzNDA2YzYzM2ZkYw==
Port 9243
tls On
tls.verify Off
[OUTPUT]
# optional: send the data to standard output for debugging
name stdout
match *
Be sure to update the values to match your own Elastic account details. The host is your Elasticsearch endpoint. Cloud_Auth corresponds to your authentication credentials and must be presented as user:password.

The Match * parameter indicates that all of the data gathered by Fluent Bit will be forwarded to Elasticsearch. We could also match based on a tag defined in the input plugin. tls On ensures that the connection between Fluent Bit and the Elasticsearch cluster is secure. By default, the Port is configured to 9200, so we need to change it to 9243, which is the port used by Elastic Cloud.

We have also defined a secondary output that sends all the data to stdout. This is not required for the Elasticsearch configuration but can be incredibly helpful if we need to debug our configuration.
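For instance, a minimal sketch of tag-based matching, using the ec2_logs tag defined in our input section, would send only those records to a given output:

[OUTPUT]
    name  stdout
    match ec2_logs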
Once you have saved the changes to your fluent-bit.conf file, you'll need to restart Fluent Bit to allow the new configuration to take effect:
sudo systemctl restart fluent-bit
Note: If Fluent Bit is configured to utilize its optional Hot Reload feature, you do not have to restart the service.
Check to make sure Fluent Bit restarted correctly.
systemctl status fluent-bit
Again, these commands may differ depending on your system.
Your logs should now be flowing into Elasticsearch, and you should be able to search your data.
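As a quick sanity check (hypothetical values shown; substitute your own endpoint and credentials), you can also query the Elasticsearch search API directly:

curl -u elastic:<your-password> "https://sample.es.us-central1.gcp.cloud.es.io:9243/_search?q=*&size=1"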
We’ve just seen a basic configuration for getting log data from an AWS EC2 instance into Elasticsearch in Elastic Cloud. The Fluent Bit Elasticsearch output plugin supports many additional parameters that enable you to fine-tune your Fluent Bit to Elasticsearch pipeline, including options for using Amazon Open Search. Check out the Fluent Bit documentation for more.
Fluent Bit also allows you to process the data before routing it to its final destination. You can, for example, parse and transform records, enrich them with additional metadata, redact sensitive fields, or route different records to different destinations.
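As a small, hypothetical illustration of in-pipeline processing, a grep filter like the one below would drop any record whose log field contains DEBUG before it ever reaches Elasticsearch:

[FILTER]
    Name    grep
    Match   ec2_logs
    Exclude log DEBUG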
Routing is particularly powerful as it allows you to redirect non-essential data to cheaper storage (or even drop it entirely), potentially saving you thousands of dollars when using costly storage and analysis applications priced by consumption.
You may be asking yourself why you should use Fluent Bit rather than Elastic Agent. It’s a fair question.
Fluent Bit is vendor-neutral. Fluent Bit doesn’t care what backend you are using. It can send data to all of the major backends, including Elasticsearch, Chronosphere Observability Platform, Splunk, Datadog, and more. This helps you to avoid costly vendor lock-in. Transitioning to a new backend is a simple configuration change—no new vendor-specific agent to install across your entire infrastructure.
Fluent Bit is lightweight. It was created to be a lightweight, highly performant alternative to Fluentd designed for containerized and IoT deployments. Although its footprint is only ~450KB, it certainly punches above its weight class when it comes to processing millions of records daily.
Fluent Bit is open source. Fluent Bit is a graduated Cloud Native Computing Foundation project under the Fluentd umbrella.
Fluent Bit is trusted. Fluent Bit has been downloaded and deployed billions of times. In fact, it is included with major Kubernetes distributions, including Google Kubernetes Engine (GKE), AWS Elastic Kubernetes Service (EKS), and Azure Kubernetes Service (AKS).
As we have seen, Fluent Bit is a powerful component of your telemetry pipeline and is relatively simple to configure manually. However, such manual configuration becomes untenable as your infrastructure scales to dozens, hundreds, or even thousands of sources.
Chronosphere Telemetry Pipeline, from the creators of Fluent Bit and Calyptia, streamlines log collection, aggregation, transformation, and routing from any source to any destination. Telemetry Pipeline also simplifies fleet operations by automating and centralizing the installation, configuration, and maintenance of Fluent Bit agents across thousands of machines.
This allows companies who are dealing with high costs and complexity the ability to control their data and scale their growing business needs.
Read more about Telemetry Pipeline
In software development, observability allows us to understand a system from the outside by asking questions about the system without knowing its inner workings. Furthermore, it allows us to troubleshoot quickly and helps answer the question, “Why is this happening?”
For us to ask (and answer) those questions, the application must be instrumented. That is, the application code must emit signals such as traces, metrics, and logs, which will contain the answers we seek. In this post, we will focus specifically on traces.
Distributed tracing involves tracking the flow of requests through a distributed system and collecting telemetry data such as traces and spans to monitor the system’s performance and behavior. Distributed tracing helps identify performance bottlenecks, optimize resource utilization, and troubleshoot issues in distributed systems.
Many platforms are used to monitor and analyze trace data and help engineers spot problems, including Chronosphere, Datadog, and the open-source Jaeger. Today, though, we will be using AWS X-Ray, a less commonly used platform but a convenient one for demonstration purposes since so many developers have AWS accounts.
To collect and route the traces to X-Ray we’ll be using Fluent Bit, a widely used open-source data collection agent, processor, and forwarder. Fluent Bit is most commonly used for logging, but it is also capable of handling traces and metrics, making it an ideal single-agent choice for any type of telemetry data.
In this post, we’ll guide you through the process of sending distributed traces to AWS X-Ray using Fluent Bit.
Instrumented applications emit trace data that is collected and processed by a centralized agent, which then sends the data to a backend for storage and analysis
In a microservices architecture, applications are instrumented using specific libraries to send trace data in a particular format supported by the storage engine.
OpenTelemetry (OTel) has become the standard format for working with telemetry data. Its open-source observability framework provides a standardized way to collect and transmit telemetry data such as traces, logs, and metrics from applications. OTel provides a common set of APIs, libraries, and tools for collecting and analyzing telemetry data in distributed systems.
We will be using a Python application (built with the Flask framework) that we've instrumented using OpenTelemetry SDKs to generate trace data in the OpenTelemetry protocol (OTLP).
We will configure Fluent Bit to receive the emitted trace data using the OpenTelemetry input plugin.
Note: For simplicity and demonstration purposes, we will be using a single service capable of generating a hierarchical distributed trace. But in a practical scenario, there would be multiple services instrumented to generate trace data.
AWS X-Ray accepts trace requests in the form of segment documents, which can be sent using two primary protocols:
Unfortunately, AWS X-Ray utilizes a non-standards-compliant trace ID. Since Fluent Bit does not support the custom X-Ray API format, it cannot send trace data directly to AWS X-Ray. To overcome this, we will be using the AWS Distro for OpenTelemetry (ADOT), which supports OTLP input and can be used with the Fluent Bit OpenTelemetry output plugin. ADOT automatically converts the compliant trace ID to the format required by AWS X-Ray.
Our architecture looks like this:
Fluent Bit both receives and submits OTLP but the data must be converted to the bespoke format required by AWS X-Ray
Here’s the Fluent Bit configuration that enables the depicted above:
[SERVICE]
flush 1
log_level info
[INPUT]
name opentelemetry
host 0.0.0.0
port 3000
successful_response_code 200
[OUTPUT]
Name opentelemetry
Match *
Host aws-adot
Port 4318
traces_uri /v1/traces
tls off
tls.verify off
add_label app fluent-bit
Breaking down the configuration above, we define one input section:

- name opentelemetry: Specifies the input plugin to use, which in this case is opentelemetry. This plugin is designed to receive telemetry data (metrics, logs, and traces) following the OpenTelemetry format.
- host 0.0.0.0: Binds the input listener to all available IP addresses on the machine, making it accessible from other machines.
- port 3000: Defines the port on which Fluent Bit will listen for incoming data.
- successful_response_code 200: The HTTP response code that Fluent Bit will send back to the sender to indicate that the data was received successfully. A value of 200 corresponds to HTTP OK, meaning the request has succeeded.

And one output section:

- Name opentelemetry: Specifies the output plugin to use. This indicates Fluent Bit will forward the processed data to another service or tool supporting OpenTelemetry data.
- Match *: This pattern matches all incoming data. In Fluent Bit, the Match directive is used to filter which data is sent to a particular output based on the tag associated with the data. The asterisk is a wildcard that matches all tags.
- Host aws-adot: Specifies the destination host to which the data will be forwarded.
- Port 4318: Defines the port on the destination host where the OpenTelemetry collector or service is listening.
- traces_uri /v1/traces: Sets the specific URI endpoint where trace data should be sent. This is part of the OpenTelemetry specification for sending trace data.
- tls off: Indicates that TLS (Transport Layer Security) will not be used for this connection, meaning data will be sent in plaintext.
- add_label app fluent-bit: Adds a label to the data being sent out. Labels are key-value pairs; here, app is the key and fluent-bit is the value.

With our INPUT and OUTPUT configuration explained, let's implement it in practice.
Create a file called fluent-bit.conf with the following contents:
[SERVICE]
flush 1
log_level info
[INPUT]
name opentelemetry
host 0.0.0.0
port 3000
successful_response_code 200
[OUTPUT]
Name opentelemetry
Match *
Host aws-adot
Port 4318
traces_uri /v1/traces
tls off
tls.verify off
add_label app fluent-bit
Create a file called otel.yaml with the following contents. Be sure to replace the key value <put-your-aws-region> with your AWS region.
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
exporters:
  awsxray:
    region: <put-your-aws-region>
service:
  pipelines:
    traces:
      receivers:
        - otlp
      exporters:
        - awsxray
This configuration defines how the AWS Distro for OpenTelemetry (ADOT) Collector operates. It specifies the collection (receivers) of telemetry data via OpenTelemetry Protocol (OTLP) over gRPC and HTTP, and the export (exporters) of trace data to AWS X-Ray:

- The otlp receiver listens on 0.0.0.0:4317 for incoming gRPC connections and on 0.0.0.0:4318 for incoming HTTP connections.
- The traces pipeline uses the otlp receiver to collect data and the awsxray exporter to send the data to AWS X-Ray.

This configuration sets up the ADOT Collector to collect telemetry data using OTLP over both gRPC and HTTP and to export trace data to AWS X-Ray for analysis and visualization.
Create a file called docker-compose.yml with the following contents and replace these two values, <put-your-aws-access-keys-id> and <put-your-aws-secret-access-key>, with your AWS credentials.
version: '3.8'
services:
  aws-adot:
    image: public.ecr.aws/aws-observability/aws-otel-collector:latest
    container_name: aws-adot
    ports:
      - "4317:4317" # Grpc port
      - "4318:4318" # Http port
      - "55679:55679"
    volumes:
      - "./otel.yaml:/otel.yaml"
    environment:
      - AWS_REGION=ap-south-1
      - AWS_ACCESS_KEY_ID=<put-your-aws-access-keys-id>
      - AWS_SECRET_ACCESS_KEY=<put-your-aws-secret-access-key>
    command: ["--config", "/otel.yaml"]
    restart: "no"
  fluent-bit:
    image: cr.fluentbit.io/fluent/fluent-bit:2.2
    container_name: fluent-bit
    ports:
      - "3000:3000"
    volumes:
      - "./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf"
    restart: "no"
  trace-generator:
    image: sharadregoti/trace-generator:v0.1.0
    container_name: trace-generator
    ports:
      - "5000:5000"
    environment:
      - OTEL_HOST_ADDR=fluent-bit:3000
    restart: "no"
This docker-compose.yml file defines a multi-container setup with three services: aws-adot, fluent-bit, and trace-generator.

- aws-adot: runs the public.ecr.aws/aws-observability/aws-otel-collector:latest image, exposes ports 4317 (gRPC) and 4318 (HTTP), and mounts the otel.yaml configuration file into the container.
- fluent-bit: runs the cr.fluentbit.io/fluent/fluent-bit:2.2 image, listens on port 3000 for log processing, and mounts the fluent-bit.conf configuration file into the container.
- trace-generator: runs the sharadregoti/trace-generator:v0.1.0 image, exposes port 5000, and uses the fluent-bit service as the destination for trace data.

Start the stack with:

docker-compose up
Open a new terminal and execute the below curl request to generate a trace:
curl -X GET http://localhost:5000/generate-hierarchical
or
curl -X GET http://localhost:5000/generate
You will observe a new trace is generated as shown in the below image.
Click on the newly created trace to view the detailed information about the request.
Execute the following to shut everything down:
# Press ctrl + c in the terminal instance where containers are running in foreground
docker-compose down
In this post, we’ve walked through the essentials of setting up distributed tracing with AWS X-Ray and Fluent Bit, demonstrating how to seamlessly integrate trace data collection and forwarding in a microservices environment. By leveraging Docker, AWS X-Ray, and Fluent Bit, developers can achieve a robust observability framework that is both scalable and easy to implement.
To learn more about Fluent Bit, visit the project website or Fluent Bit Academy, where you will find hours of on-demand training videos covering best practices and how-to's on advanced processing, routing, and all things Fluent Bit.
We also invite you to join the vibrant Fluent community. Visit the project’s GitHub repository to learn how to become a contributor. Or join the Fluent Slack where you will find thousands of fellow Fluent Bit and Fluentd users helping one another with issues and discussing the projects’ roadmaps.