Administrators using Internet Information Services (IIS) to host websites know that IIS logs can be difficult to search and analyze, especially when they are under pressure to identify the cause of an outage or a performance issue. Event Viewer, the Windows default application for searching and analyzing logs, is unintuitive for users. Many users prefer other tools, which typically require converting the logs into JSON. Although IIS logs are flat files in which each line contains data about an individual web hit, much like Apache or Nginx logs, they are not as easy to format into JSON.

In this post, we’ll demonstrate a better approach to IIS logging. We’ll show you how to configure IIS to enrich the logs with non-standard metadata. We’ll then collect the logs with Fluent Bit where we will use a custom Wasm plugin to transform and enrich the data. Finally, we’ll have Fluent Bit route our data (now formatted as JSON) to ClickHouse for storage where we can then extract it using Grafana for visualization and analysis.

Fluent Bit and Wasm

Fluent Bit is a fast, lightweight, and highly scalable log, metric, and trace processor and forwarder that has been deployed billions of times. It is a Cloud Native Computing Foundation graduated open-source project with an Apache 2.0 license.

Fluent Bit uses a pluggable architecture, enabling new data sources and destinations, processing filters, and other new features to be added with approved plugins. Although there are dozens of supported plugins, there may be times when no out-of-the-box plugin accomplishes the exact task you need.

Thankfully, Fluent Bit lets developers write custom scripts using Lua or WebAssembly for such instances.

WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine. Wasm is designed as a portable compilation target for programming languages, enabling deployment on the web for client and server applications. Developer reference documentation for Wasm can be found on MDN’s WebAssembly pages.

This post covers how Wasm can be used with Fluent Bit to implement custom logic and functionalities.

To achieve the desired outcomes, several tasks need to be addressed. Firstly, data validation should be implemented to ensure the accuracy and integrity of the processed information. Additionally, we should perform type conversion to ensure compatibility and consistency across different data formats.

Moreover, integrating external resources such as APIs or databases can enhance the logs by providing additional relevant information. It is crucial to apply backward compatibility, maintainability, and testability principles to the source code to ensure its longevity and ease of future modifications.

Specifically, we’ll demonstrate how to collect and parse Internet Information Services (IIS) w3c logs (with some custom modifications) and transform the raw string into a structured JSON record.

What you’ll need to get started:

Understanding the use case

Organizations need to collect and parse logs generated by IIS (Internet Information Services). In this particular use case, we will explore the significance of utilizing the Fluent Bit WebAssembly (Wasm) plugin to create custom modifications for logs collected in the w3c format.

By leveraging the Fluent Bit Wasm plugin, organizations can enhance their log processing capabilities by implementing tailored transformations and enrichments specific to their requirements. This ability empowers them to extract valuable insights and gain a deeper understanding of their IIS logs, enabling more effective troubleshooting, monitoring, and analysis of their web server infrastructure.

The following diagram provides an overview of the actions we will take:

A diagram illustrating IIS logging files being processed by Fluent Bit and output to ClickHouse in JSON format for use with Grafana.
We use Fluent Bit to collect IIS logs and then process them with a custom script written in Rust and implemented using the Fluent Bit Wasm plugin. We then route the processed logs to ClickHouse for storage and visualization using Grafana.

This diagram highlights an interesting aspect, namely the introduction of WebAssembly in Fluent Bit. In previous versions of Fluent Bit, the workflow for this use case was relatively straightforward. Log information was extracted using parsers that relied on regular expressions or Lua code.

However, with the introduction of the Wasm plugin, Fluent Bit now offers a more versatile and powerful approach to log extraction and processing. Wasm enables the implementation of custom modifications and transformations, allowing for greater flexibility and efficiency in handling log data. This advancement in Fluent Bit’s capabilities opens up new possibilities for extracting and manipulating log information, ultimately enhancing the overall log processing workflow.

Currently, Fluent Bit offers an ecosystem of plugins, filters, and robust parsers with which you can build pipelines and route data for different workflows.

It is possible to create parsers using regular expressions and to write components in programming languages such as C, Golang, and Rust using Wasm.

Our use case shows how to use Rust to develop a Wasm plugin.

Note: One of the reasons for using Rust as a programming language is that I previously developed a PoC project to learn Rust. The idea was to create a Fluent Bit-inspired log collector for IIS files, parse the logs, and send them to various destinations (Kafka, Loki, Postgres). Having that code base turned out to be interesting for combining existing logic with the proposal offered by Fluent Bit to integrate Rust with Wasm into its ecosystem.

Configure the IIS log output standard:

By default, IIS w3c logs include fields that may not always provide relevant information for defining usage metrics and access patterns. Additionally, these logs may not cover custom fields specific to our use case.

One example is the c-authorization-header field, which is essential for our analysis but not included in the default log format. Therefore, it becomes necessary to customize the log configuration to include this field and any other relevant custom fields crucial to our specific requirements.

This customization ensures we can access all the necessary information to accurately define metrics and gain insights into our IIS server’s usage and access patterns.

date time s-sitename s-computername s-ip cs-method cs-uri-stem cs-uri-query s-port c-ip cs(User-Agent) cs(Cookie) cs(Referer) cs-host sc-status sc-bytes cs-bytes time-taken c-authorization-header.
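To make the field order concrete, here is a minimal sketch (not taken from the project) that pairs each configured field name with the corresponding token of a log line. It assumes that fields are separated by single spaces and that the line follows the field list above; the FIELDS constant and the pair_fields function are hypothetical names used only for illustration.

```rust
/// Field names in the order configured above (hypothetical constant).
const FIELDS: [&str; 19] = [
    "date", "time", "s-sitename", "s-computername", "s-ip", "cs-method",
    "cs-uri-stem", "cs-uri-query", "s-port", "c-ip", "cs(User-Agent)",
    "cs(Cookie)", "cs(Referer)", "cs-host", "sc-status", "sc-bytes",
    "cs-bytes", "time-taken", "c-authorization-header",
];

/// Pairs each field name with its value, or returns None when the
/// number of tokens does not match the configured field list.
fn pair_fields(line: &str) -> Option<Vec<(&'static str, &str)>> {
    let values: Vec<&str> = line.split(' ').collect();
    if values.len() != FIELDS.len() {
        return None;
    }
    Some(FIELDS.iter().copied().zip(values).collect())
}
```

Rejecting lines with an unexpected token count is a cheap first validation step: if the IIS field selection changes, the mismatch shows up immediately instead of silently shifting values into the wrong columns.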

Writing the Wasm program

To get started, we need to create a new project to construct the filter. Following the official documentation, run this command in our terminal:

cargo new flb_filter_plugin --lib

The cargo new flb_filter_plugin --lib command creates a new Rust project. The --lib flag specifies that the project should be created as a library, which is suitable for developing Fluent Bit filter plugins.

Next, open the Cargo.toml file and add the following section:

[lib]
crate-type = ["cdylib"]

[dependencies]

serde = { version = "1.0.160", features = ["derive"] }
serde_json = "1.0.104"
serde_bytes = "0.11"
rmp-serde = "1.1"
regex = "1.9.2"
chrono = "0.4.24"
libc = "0.2"

Next, open up src/lib.rs and overwrite it with the following entry point code. We will explain the code in the following section.

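// Imports required by the entry point below.
// LogEntryIIS is defined elsewhere in the project (see the repository linked later in this post).
use std::io::Write;
use std::os::raw::c_char;
use std::slice;
use std::str;

use chrono::{TimeZone, Utc};
use serde_json::{json, Value};

#[no_mangle]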
pub extern "C" fn flb_filter_log_iis_w3c_custom(
    tag: *const c_char,
    tag_len: u32,
    time_sec: u32,
    time_nsec: u32,
    record: *const c_char,
    record_len: u32,
) -> *const u8 {
    let slice_tag: &[u8] = unsafe { slice::from_raw_parts(tag as *const u8, tag_len as usize) };
    let slice_record: &[u8] =
        unsafe { slice::from_raw_parts(record as *const u8, record_len as usize) };
    let mut vt: Vec<u8> = Vec::new();
    vt.write(slice_tag).expect("Unable to write");
    let vtag = str::from_utf8(&vt).unwrap();
    let v: Value = serde_json::from_slice(slice_record).unwrap();
    let dt = Utc.timestamp_opt(time_sec as i64, time_nsec).unwrap();
    let time = dt.format("%Y-%m-%dT%H:%M:%S.%9f %z").to_string();

    let input_logs = v["log"].as_str().unwrap();
    let mut buf=String::new();
    if let Some(el) = LogEntryIIS::parse_log_iis_w3c_parser(input_logs) {
        let log_parsered = json!({
            "date": el.date_time,
            "s_sitename": el.s_sitename,
            "s_computername": el.s_computername,
            "s_ip": el.s_ip,
            "cs_method": el.cs_method,
            "cs_uri_stem": el.cs_uri_stem,
            "cs_uri_query": el.cs_uri_query,
            "s_port": el.s_port,
            "c_ip": el.c_ip,
            "cs_user_agent": el.cs_user_agent,
            "cs_cookie": el.cs_cookie,
            "cs_referer": el.cs_referer,
            "cs_host": el.cs_host,
            "sc_status": el.sc_status,
            "sc_bytes": el.sc_bytes.parse::<i32>().unwrap(),
            "cs_bytes": el.cs_bytes.parse::<i32>().unwrap(),
            "time_taken": el.time_taken.parse::<i32>().unwrap(),
            "c_authorization_header": el.c_authorization_header,
            "tag": vtag,
            "source": "LogEntryIIS",
            "timestamp": format!("{}", time)
        });

        let message = json!({
            "log": log_parsered,
            "s_sitename": el.s_sitename,
            "s_computername": el.s_computername,
            "cs_host": el.cs_host,
            "date": el.date_time,
        });
        buf= message.to_string();
    } 
    buf.as_ptr()

}

Program explanation

This Rust code defines a function called flb_filter_log_iis_w3c_custom, which is intended to be used as a filter plugin in Fluent Bit with the WebAssembly module.

let slice_tag: &[u8] = unsafe { slice::from_raw_parts(tag as *const u8, tag_len as usize) };
let slice_record: &[u8] =
    unsafe { slice::from_raw_parts(record as *const u8, record_len as usize) };
let mut vt: Vec<u8> = Vec::new();
vt.write(slice_tag).expect("Unable to write");
let vtag = str::from_utf8(&vt).unwrap();
let v: Value = serde_json::from_slice(slice_record).unwrap();
let dt = Utc.timestamp_opt(time_sec as i64, time_nsec).unwrap();
let time = dt.format("%Y-%m-%dT%H:%M:%S.%9f %z").to_string();

The function takes several parameters: tag, tag_len, time_sec, time_nsec, record, and record_len. These parameters represent the tag, timestamp, and log record information passed from Fluent Bit.

The code then converts the received parameters into Rust slices (&[u8]) to work with the data. It creates a mutable vector (Vec<u8>) called vt and writes the tag data into it. The vtag variable is created by converting the vt vector into a UTF-8 string.

Next, the code deserializes the record data into a serde_json::Value object called v.

The incoming structured logs are:

{"log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 GET / - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 304 142 756 1078 -"}

It also converts the time_sec and time_nsec values into a DateTime object using the Utc.timestamp_opt function.
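In the plugin, chrono handles this conversion. To show what the resulting string contains, here is a standard-library-only sketch that produces the same layout as the format string above for UTC timestamps. It is not how chrono is implemented; the civil_from_days and format_utc names are hypothetical, and the date arithmetic is the widely used "days to civil date" calculation for the proleptic Gregorian calendar.

```rust
/// Converts days since 1970-01-01 into (year, month, day).
fn civil_from_days(days: i64) -> (i64, u32, u32) {
    let z = days + 719_468;
    let era = z.div_euclid(146_097);
    let doe = z.rem_euclid(146_097); // day of era, 0..=146096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365;
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, March = 0
    let day = (doy - (153 * mp + 2) / 5 + 1) as u32;
    let month = if mp < 10 { (mp + 3) as u32 } else { (mp - 9) as u32 };
    // January and February belong to the following civil year.
    let year = yoe + era * 400 + i64::from(month <= 2);
    (year, month, day)
}

/// Formats epoch seconds and nanoseconds as "YYYY-MM-DDTHH:MM:SS.nnnnnnnnn +0000".
fn format_utc(time_sec: u32, time_nsec: u32) -> String {
    let secs = i64::from(time_sec);
    let (year, month, day) = civil_from_days(secs / 86_400);
    let rem = secs % 86_400;
    format!(
        "{:04}-{:02}-{:02}T{:02}:{:02}:{:02}.{:09} +0000",
        year, month, day, rem / 3_600, rem % 3_600 / 60, rem % 60, time_nsec
    )
}
```

For the record timestamp that appears in the sample output later in this post (1697906192 seconds and 407803136 nanoseconds), format_utc returns 2023-10-21T16:36:32.407803136 +0000.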

The code then extracts the raw log line from the v object and parses it into its individual fields. These fields represent various properties of an IIS log entry, such as date, site name, computer name, IP address, HTTP method, URI, status codes, and more.

let input_logs = v["log"].as_str().unwrap();
    let mut buf=String::new();
    if let Some(el) = LogEntryIIS::parse_log_iis_w3c_parser(input_logs) {
        let log_parsered = json!({
            "date": el.date_time,
            "s_sitename": el.s_sitename,
            "s_computername": el.s_computername,
            "s_ip": el.s_ip,
            "cs_method": el.cs_method,
            "cs_uri_stem": el.cs_uri_stem,
            "cs_uri_query": el.cs_uri_query,
            "s_port": el.s_port,
            "c_ip": el.c_ip,
            "cs_user_agent": el.cs_user_agent,
            "cs_cookie": el.cs_cookie,
            "cs_referer": el.cs_referer,
            "cs_host": el.cs_host,
            "sc_status": el.sc_status,
            "sc_bytes": el.sc_bytes.parse::<i32>().unwrap(),
            "cs_bytes": el.cs_bytes.parse::<i32>().unwrap(),
            "time_taken": el.time_taken.parse::<i32>().unwrap(),
            "c_authorization_header": el.c_authorization_header,
            "tag": vtag,
            "source": "LogEntryIIS",
            "timestamp": format!("{}", time)
        });

        let message = json!({
            "log": log_parsered,
            "s_sitename": el.s_sitename,
            "s_computername": el.s_computername,
            "cs_host": el.cs_host,
            "date": el.date_time,
        });
        buf= message.to_string();
    } 
    buf.as_ptr()

If the log entry can be successfully parsed using the LogEntryIIS::parse_log_iis_w3c_parser function, the code constructs a new JSON object representing the parsed log entry. It includes additional fields like the tag, source, and timestamp. The log entry and some specific fields are also included in a separate JSON object called message.
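Note that the numeric fields above are converted with parse::<i32>().unwrap(), which panics if a value is not a number. As a hedged alternative, the sketch below shows one way to combine validation and type conversion so that malformed lines are rejected instead. ParsedHit and parse_hit are hypothetical names, the struct is a trimmed-down stand-in for the project's LogEntryIIS, and parsing sc_status as a number is a choice made only for this example.

```rust
/// Hypothetical, trimmed-down record for illustration only.
#[derive(Debug, PartialEq)]
struct ParsedHit {
    cs_method: String,
    sc_status: u16,
    sc_bytes: i32,
    cs_bytes: i32,
    time_taken: i32,
}

/// Returns None instead of panicking when the line has the wrong number
/// of fields or when a numeric field does not parse.
fn parse_hit(line: &str) -> Option<ParsedHit> {
    let f: Vec<&str> = line.split_whitespace().collect();
    // The custom format configured earlier has 19 fields.
    if f.len() != 19 {
        return None;
    }
    Some(ParsedHit {
        cs_method: f[5].to_string(),
        sc_status: f[14].parse().ok()?,
        sc_bytes: f[15].parse().ok()?,
        cs_bytes: f[16].parse().ok()?,
        time_taken: f[17].parse().ok()?,
    })
}
```

With this shape, the caller decides what to do with a rejected line (skip it, forward it unparsed, or tag it for inspection) rather than aborting the filter.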

Finally, the code converts the message object to a string and assigns it to the buf variable. The function returns a pointer to the buf string, which will be used by Fluent Bit.
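One detail worth keeping in mind: in Rust terms, buf is dropped when the function returns, so the pointer refers to memory the function no longer owns. The sketch below shows a general Rust FFI pattern for handing a string to a C-style caller by transferring ownership of a NUL-terminated buffer. It illustrates Rust semantics only; whether and how the host releases such a buffer is not covered here, and into_host_ptr and reclaim are hypothetical helper names.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

/// Moves the JSON text into a NUL-terminated heap allocation and gives up
/// ownership, so the pointer stays valid after the function returns.
fn into_host_ptr(json: String) -> *const c_char {
    // CString::new fails only if the text contains an interior NUL byte;
    // fall back to an empty string in that case.
    CString::new(json).unwrap_or_default().into_raw()
}

/// Takes back a pointer produced by `into_host_ptr` so it can be freed.
/// Safety: `ptr` must come from `into_host_ptr` and be reclaimed only once.
unsafe fn reclaim(ptr: *const c_char) -> String {
    let owned = unsafe { CString::from_raw(ptr as *mut c_char) };
    owned.into_string().unwrap_or_default()
}
```

The trade-off is that a buffer handed over with into_raw is leaked unless something eventually reclaims it, so this pattern needs a clear agreement about who frees the memory.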

In summary, this code defines a custom filter plugin for Fluent Bit that processes IIS w3c log records, extracts specific fields, and constructs new JSON objects representing the parsed log entries.

The rest of the code is hosted at https://github.com/kenriortega/flb_filter_iis.git. It is an open-source project and currently provides two functions focused on the current need: parsing and processing a specific format. However, it is subject to new proposals and ideas to grow the project as a suite of possible use cases.

Compiling the Wasm program

To compile this plugin, we suggest consulting the official Fluent Bit documentation, which describes how to build it from your local environment and how to install the Rust toolchain with the Wasm target.

$ cargo build --target wasm32-unknown-unknown --release
$ ls target/wasm32-unknown-unknown/release/*.wasm
target/wasm32-unknown-unknown/release/filter_rust.wasm

If you want to use the prebuilt plugin from the repository, its releases section provides builds that are compiled automatically using GitHub Actions.

Configuring Fluent Bit to use Wasm plugin

To reproduce the demo, the repository includes a docker-compose.yaml file that defines the resources needed for the steps below.

version: '3.8'

volumes:
  clickhouse:
services:
  clickhouse:
    container_name: clickhouse
    image: bitnami/clickhouse:latest
    environment:
      - ALLOW_EMPTY_PASSWORD=no
      - CLICKHOUSE_ADMIN_PASSWORD=default
    ports:
      - 8123:8123


  fluent-bit:
    image: cr.fluentbit.io/fluent/fluent-bit
    container_name: fluent-bit
    ports:
      - 8888:8888
      - 2020:2020
    volumes:
      - ./docker/conf/fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf
      - ./target/wasm32-unknown-unknown/release/flb_filter_iis_wasm.wasm:/plugins/flb_filter_iis_wasm.wasm
      - ./docker/dataset:/dataset/


  grafana:
    image: grafana/grafana:latest
    environment:
      - GF_PATHS_PROVISIONING=/etc/grafana/provisioning
      - GF_AUTH_ANONYMOUS_ENABLED=false
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
    depends_on:
      - clickhouse
    ports:
      - "3000:3000"

Next, we configure Fluent Bit to process the logs collected from IIS. To make this tutorial more practical, we will use the dummy input plugin to generate sample logs. We provide several inputs to simulate GET and POST requests with different status codes (200, 401, 404, and 500).

[INPUT]
    Name dummy
    Dummy {"log": "2023-07-20 17:18:54 W3SVC279 WIN-PC1 192.168.1.104 GET /api/Site/site-data qName=quww 13334 10.0.0.0 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/114.0.0.0+Safari/537.36+Edg/114.0.1823.82 _ga=GA2.3.499592451.1685996504;+_gid=GA2.3.1209215542.1689808850;+_ga_PC23235C8Y=GS2.3.1689811012.8.0.1689811012.0.0.0 http://192.168.1.104:13334/swagger/index.html 192.168.1.104:13334 200 456 1082 3131 Bearer+token"}
    Tag log.iis.*


[INPUT]
    Name dummy
    Dummy {"log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 GET / - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 404 142 756 1078 -"}
    Tag log.iis.get


[INPUT]
    Name dummy
    Dummy {"log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 POST / - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 200 142 756 1078 -"}
    Tag log.iis.post




[INPUT]
    Name dummy
    Dummy {"log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 POST / - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 401 142 756 1078 -"}
    Tag log.iis.post


[FILTER]
    Name   wasm
    match  log.iis.*
    WASM_Path /plugins/flb_filter_iis_wasm.wasm
    Function_Name flb_filter_log_iis_w3c_custom
    accessible_paths .

This Fluent Bit filter configuration specifies the usage of a WebAssembly filter plugin to process log records that match the pattern log.iis.*.

The Name parameter, set to wasm, selects the filter plugin to load.

The WASM_Path parameter specifies the path to the WebAssembly module file that contains the filter implementation.

The Function_Name parameter specifies the name of the function within the WebAssembly module that will be used as the filter implementation.

The accessible_paths parameter lists the paths that the Wasm program is allowed to access.
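To illustrate how a pattern such as log.iis.* selects records by tag, here is a simplified wildcard matcher. It is only a sketch of the idea, not Fluent Bit's routing code, and the tag_matches function is a hypothetical name. It treats * as "any run of characters, including none".

```rust
/// Returns true when `tag` matches `pattern`, where `*` in the pattern
/// stands for any run of characters (including none).
fn tag_matches(pattern: &str, tag: &str) -> bool {
    let (p, t) = (pattern.as_bytes(), tag.as_bytes());
    let (mut pi, mut ti) = (0, 0);
    // Position of the last `*` seen and the tag index it was tried at.
    let mut backtrack: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == b'*' {
            backtrack = Some((pi, ti));
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some((star_pi, star_ti)) = backtrack {
            // Let the last `*` absorb one more character and retry.
            backtrack = Some((star_pi, star_ti + 1));
            pi = star_pi + 1;
            ti = star_ti + 1;
        } else {
            return false;
        }
    }
    // Any trailing `*` can match the empty string.
    p[pi..].iter().all(|&c| c == b'*')
}
```

Under this sketch, the tags log.iis.get and log.iis.post used by the dummy inputs both match log.iis.*, while a tag such as log.nginx does not.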

The stdout output lets us inspect the result of the filter processing in the terminal.

[OUTPUT]
    name stdout
    match log.iis.*

The result is as follows:

2023-10-21 09:36:33 [0] log.iis.post: [[1697906192.407803136, {}], {"cs_host"=>"localhost", "date"=>"2023-08-11 19:56:44", "log"=>{"c_authorization_header"=>"-", "c_ip"=>"::1", "cs_bytes"=>756, "cs_cookie"=>"-", "cs_host"=>"localhost", "cs_method"=>"POST", "cs_referer"=>"-", "cs_uri_query"=>"-", "cs_uri_stem"=>"/", "cs_user_agent"=>"Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200", "date"=>"2023-08-11 19:56:44", "s_computername"=>"WIN-PC1", "s_ip"=>"::1", "s_port"=>"80", "s_sitename"=>"W3SVC1", "sc_bytes"=>142, "sc_status"=>"200", "source"=>"LogEntryIIS", "tag"=>"log.iis.post", "time_taken"=>1078, "timestamp"=>"2023-10-21T16:36:32.407803136 +0000"}, "s_computername"=>"WIN-PC1", "s_sitename"=>"W3SVC1"}]

The output is a log record that has been processed by Fluent Bit with the specified filter configuration. This transformation offers all the advantages of our code implementation, data validation, and type conversion.

# Should be optional.
[OUTPUT]
    name http
    tls off
    match *
    host clickhouse
    port 8123
    URI /?query=INSERT+INTO+fluentbit.iis+FORMAT+JSONEachRow
    format json_stream
    json_date_key timestamp
    json_date_format epoch
    http_user default
    http_passwd default

To ingest these logs into ClickHouse, we use the http output plugin, which flushes records to an HTTP endpoint by issuing a POST request with the data records in MessagePack or JSON format. In the configuration above, the URI carries the INSERT statement and the records are sent as a JSON stream.

Please refer to the official documentation for more information on Fluent Bit’s HTTP output module.
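The URI value in the output configuration is simply the INSERT statement with its spaces written as +. As a small sketch of that encoding, the hypothetical clickhouse_uri helper below form-encodes a SQL statement into the query parameter.

```rust
/// Builds the request path by form-encoding the SQL statement into the
/// `query` parameter: spaces become `+`, unreserved characters pass
/// through, and every other byte is percent-encoded.
fn clickhouse_uri(sql: &str) -> String {
    let mut out = String::from("/?query=");
    for b in sql.bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'*' => {
                out.push(b as char)
            }
            b' ' => out.push('+'),
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}
```

Passing the statement INSERT INTO fluentbit.iis FORMAT JSONEachRow reproduces the URI used in the configuration, which makes it easy to adapt the target database or table name.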

Setting up the database output

The ClickHouse database and table are configured as shown below; this configuration was taken from the article Sending Kubernetes logs To ClickHouse with Fluent Bit.

With the structured logs parsed by our filter stored in ClickHouse, we can run queries that analyze the behavior of our websites and APIs hosted on IIS.

First, we need to create the database and the table using your client of choice.

CREATE DATABASE fluentbit;

SET allow_experimental_object_type = 1;
CREATE TABLE fluentbit.iis
(
    log JSON,
    s_sitename String,
    s_computername String,
    cs_host String,
    date Datetime
)
Engine = MergeTree ORDER BY tuple(date,s_sitename,s_computername,cs_host)
TTL date + INTERVAL 3 MONTH DELETE;

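These statements are written in ClickHouse syntax. They create a database named “fluentbit” and a table named “iis” within that database. The table stores the full parsed record in a JSON column named log (an experimental type, hence the allow_experimental_object_type setting) alongside the s_sitename, s_computername, cs_host, and date columns, which also form the sort key. The TTL clause deletes rows three months after their date value.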

We can check that our workflow is properly working by checking the data entry:

SET output_format_json_named_tuples_as_objects = 1;
SELECT log FROM fluentbit.iis
LIMIT 1000 FORMAT JSONEachRow;

Now that we have confirmed that ClickHouse is successfully receiving data from Fluent Bit, we can perform queries that provide us with information about the performance and behavior of our sites.

For example, to get the average of time_taken (the same pattern applies to sc_bytes and cs_bytes):

SELECT AVG(log.time_taken) FROM fluentbit.iis;

Another example is grouping by IP. This query is an aggregation on the “fluentbit.iis” table:

SET output_format_json_named_tuples_as_objects = 1;
SELECT COUNT(*),c_ip  FROM fluentbit.iis
GROUP BY log.c_ip as c_ip;

Common queries

SELECT count(*) 
FROM fluentbit.iis
WHERE log.sc_status LIKE  '4%';

Queries like this one count the rows in the “fluentbit.iis” table that meet a specific condition; here, requests whose status code begins with 4 (client errors).

These and many other queries regarding the collected logs can be performed according to our needs.

Visualizing our data with Grafana

Now that our records are stored in a database, we can use a visualization tool like Grafana for analysis rather than relying solely on pure SQL.

ClickHouse makes this process easy by offering a plugin for Grafana. The Grafana plugin allows users to connect directly to ClickHouse, enabling them to create interactive dashboards and visually explore their data.

With Grafana’s intuitive interface and powerful visualization capabilities, users can gain valuable insights and make data-driven decisions more effectively. To learn more about connecting Grafana to ClickHouse, you can find detailed documentation and instructions on the official ClickHouse website: Connecting Grafana to ClickHouse.

Screen capture showing ClickHouse plugin for Grafana

Conclusion

The Fluent Bit Wasm filter approach provides us with several powerful advantages inherent to programming languages:

Improve your skills with Fluent Bit Academy

To learn more about Fluent Bit and its powerful data processing and routing capabilities, check out Fluent Bit Academy. It’s filled with on-demand videos guiding you through all things Fluent Bit: best practices and how-to’s on advanced processing rules, routing to multiple destinations, and much more.


About Fluent Bit and Chronosphere

With Chronosphere’s acquisition of Calyptia in 2024, Chronosphere became the primary corporate sponsor of Fluent Bit. Eduardo Silva — the original creator of Fluent Bit and co-founder of Calyptia — leads a team of Chronosphere engineers dedicated full-time to the project, ensuring its continuous development and improvement.

Fluent Bit is a graduated project of the Cloud Native Computing Foundation (CNCF) under the umbrella of Fluentd, alongside other foundational technologies such as Kubernetes and Prometheus. Chronosphere is also a silver-level sponsor of the CNCF.

If it’s not in the logs, did it really happen?

Logs are the foundational data of any observability effort. They provide information about every event and error in your applications, making them essential for troubleshooting. Elasticsearch allows us to store, search, and analyze huge volumes of data quickly, making it ideal for the massive volumes of log and other telemetry data generated by modern applications. It is also one of the components of the ELK Stack (Elasticsearch, Logstash, and Kibana), a widely-used log management solution.

Fluent Bit is the leading open source solution for collecting, processing, and routing large volumes of telemetry data, including logs, traces, and metrics. When used as the agent for sending logs to Elasticsearch, it gives you a highly performant telemetry pipeline.

In this post, we will show you how to send logs to Elasticsearch using Fluent Bit.

Graphic depicting a process flow with three steps: a log file icon, an arrow pointing to a central circle with a bird logo representing Fluent Bit, and another arrow pointing to the Elastic logo, illustrating data transfer from Fluent Bit to Elasticsearch.

Before we get started

This tutorial assumes that you already have Fluent Bit installed and running on your source and that you have Elasticsearch.

For this tutorial, we will run Fluent Bit on an AWS EC2 instance running Amazon Linux 2 and send the logs to Elastic Cloud, Elastic’s hosted service. The configurations you use will vary slightly depending on your source and whether you are using Elastic Cloud or another version of Elasticsearch.

Configure Fluent Bit

Input Configuration

Fluent Bit accepts data from a variety of sources using input plugins. The Tail input plugin allows you to read from a text log file as though you were running the tail -f command.

Add the following to your fluent-bit.conf file.

[INPUT]
    Name      tail
    Path      /var/log/*.log
    Tag       ec2_logs

Depending upon your source, you may need to adjust the Path parameter to point to your logs. Name identifies which plugin Fluent Bit should load, and is not customizable by the user. Tag is optional but can be used for routing and filtering your data (more on that below).

Output Configuration

As with inputs, Fluent Bit uses output plugins to send the gathered data to their desired destinations.

To set up your configuration, you will need to gather some information from your Elasticsearch deployment:

Screenshot of an Elastic deployment dashboard displaying deployment details like application, name, cloud ID, and endpoints, with options for autoscaling, instance configuration, and various actions—including configuring Fluent Bit to Elasticsearch integration.

Once you have gathered the required information, add the following to your fluent-bit.conf file below the Input section.

[OUTPUT]
    Name  es
    Match *
    Host https://sample.es.us-central1.gcp.cloud.es.io
    Cloud_auth elastic:yRSUzmsEep2DoGIyNT7bFEr4
    Cloud_id sample:dXMtY2VudHJhbDEuZ2NwLmNsb3VkLmVzLmlvOjQ0MyQ2MDA4NjljMjA4M2M0ZWM2YWY2MDQ5OWE5Y2Y3Y2I0NCQxZTAyMzcxYzAwODg0NDJjYWI0NzIzNDA2YzYzM2ZkYw==
    Port 9243
    tls On
    tls.verify Off

[OUTPUT]
    # optional: send the data to standard output for debugging
    name  stdout
    match *

Be sure to update the values to match your own Elastic account details.

The host is your Elasticsearch endpoint. Cloud_Auth corresponds to your authentication credentials and must be presented as user:password.

The Match * parameter indicates that all of the data gathered by Fluent Bit will be forwarded to Elasticsearch. We could also match based on a tag defined in the input plugin. tls On ensures that the connection between Fluent Bit and the Elasticsearch cluster is secure. By default, the Port is configured to 9200, so we need to change that to 9243, which is the port used by Elastic Cloud.

We have also defined a secondary output that sends all the data to stdout. This is not required for the Elasticsearch configuration but can be incredibly helpful if we need to debug our configuration.

Start Sending Your Logs!

Once you have saved the changes to your fluent-bit.conf file, you’ll need to restart Fluent Bit to allow the new configuration to take effect:

sudo systemctl restart fluent-bit

Note: If Fluent Bit is configured to utilize its optional Hot Reload feature, you do not have to restart the service.

Check to make sure Fluent Bit restarted correctly.

systemctl status fluent-bit

Again, these commands may differ depending on your system.

Your logs should now be flowing into Elasticsearch, and you should be able to search your data.

Learn what else Fluent Bit can do

We’ve just seen a basic configuration for getting log data from an AWS EC2 instance into Elasticsearch in Elastic Cloud. The Fluent Bit Elasticsearch output plugin supports many additional parameters that enable you to fine-tune your Fluent Bit to Elasticsearch pipeline, including options for using Amazon OpenSearch. Check out the Fluent Bit documentation for more.

Fluent Bit also allows you to process the data before routing it to its final destination. You can, for example:

Routing is particularly powerful as it allows you to redirect non-essential data to cheaper storage (or even drop it entirely), potentially saving you thousands of dollars when using costly storage and analysis applications priced by consumption.

Why Use Fluent Bit instead of Elastic Agent?

You may be asking yourself why you should use Fluent Bit rather than Elastic Agent. It’s a fair question.

Fluent Bit is vendor-neutral. Fluent Bit doesn’t care what backend you are using. It can send data to all of the major backends, including Elasticsearch, Chronosphere Observability Platform, Splunk, Datadog, and more. This helps you to avoid costly vendor lock-in. Transitioning to a new backend is a simple configuration change—no new vendor-specific agent to install across your entire infrastructure.

Fluent Bit is lightweight. It was created to be a lightweight, highly performant alternative to Fluentd designed for containerized and IoT deployments. Although its footprint is only ~450KB, it certainly punches above its weight class when it comes to processing millions of records daily.

Fluent Bit is open source. Fluent Bit is a graduated Cloud Native Computing Foundation project under the Fluentd umbrella.

Fluent Bit is trusted. Fluent Bit has been downloaded and deployed billions of times. In fact, it is included with major Kubernetes distributions, including Google Kubernetes Engine (GKE), AWS Elastic Kubernetes Service (EKS), and Azure Kubernetes Service (AKS).

Control the complexity of your Fluent Bit-based pipelines with Chronosphere

As we have seen, Fluent Bit is a powerful component of your telemetry pipeline and is relatively simple to configure manually. However, such manual configuration becomes untenable as your infrastructure scales to dozens, hundreds, or even thousands of sources.

Chronosphere Telemetry Pipeline, from the creators of Fluent Bit and Calyptia, streamlines log collection, aggregation, transformation, and routing from any source to any destination. Telemetry Pipeline also simplifies fleet operations by automating and centralizing the installation, configuration, and maintenance of Fluent Bit agents across thousands of machines.

This gives companies that are dealing with high costs and complexity the ability to control their data and scale with their growing business needs.

Read more about Telemetry Pipeline


Distributed tracing and observability

In software development, observability allows us to understand a system from the outside by asking questions about the system without knowing its inner workings. Furthermore, it allows us to troubleshoot quickly and helps answer the question, “Why is this happening?”

For us to ask (and answer) those questions, the application must be instrumented. That is, the application code must emit signals such as traces, metrics, and logs, which will contain the answers we seek. In this post, we will focus specifically on traces.

Distributed tracing involves tracking the flow of requests through a distributed system and collecting telemetry data such as traces and spans to monitor the system’s performance and behavior. Distributed tracing helps identify performance bottlenecks, optimize resource utilization, and troubleshoot issues in distributed systems.
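To make the vocabulary concrete, here is a minimal sketch of how traces and spans relate. It is not the OpenTelemetry SDK's data model; the Span class, the helper functions, and the operation names are hypothetical and exist only for illustration. Every span in one request shares a trace ID, each span points at its parent, and comparing durations is what lets us spot a bottleneck.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """One timed operation within a trace (illustrative, not the OTel SDK type)."""
    trace_id: str              # shared by every span that belongs to the same request
    span_id: str               # unique per operation
    parent_id: Optional[str]   # None marks the root span
    name: str
    start_ms: int
    end_ms: int

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.start_ms


def new_span(trace_id, parent, name, start_ms, end_ms):
    """Create a span, linking it to its parent when one is given."""
    parent_id = parent.span_id if parent else None
    return Span(trace_id, secrets.token_hex(8), parent_id, name, start_ms, end_ms)


def slowest_child(spans, parent):
    """Return the direct child of `parent` with the longest duration, or None."""
    children = [s for s in spans if s.parent_id == parent.span_id]
    return max(children, key=lambda s: s.duration_ms, default=None)


# A hypothetical request that fans out into two child operations.
trace_id = secrets.token_hex(16)
root = new_span(trace_id, None, "GET /checkout", 0, 120)
db = new_span(trace_id, root, "query orders", 10, 95)
cache = new_span(trace_id, root, "cache lookup", 96, 101)
spans = [root, db, cache]

bottleneck = slowest_child(spans, root)
print(f"{bottleneck.name} took {bottleneck.duration_ms} ms of {root.duration_ms} ms")
```

A tracing backend answers the same kind of question at scale, across services instead of within a single list.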

Many platforms are used to monitor and analyze trace data and help engineers spot problems, including Chronosphere, Datadog, and the open-source Jaeger. Today, though, we will be using AWS X-Ray, a less commonly used platform but a convenient one for demonstration purposes since so many developers have AWS accounts.

To collect and route the traces to X-Ray we’ll be using Fluent Bit, a widely used open-source data collection agent, processor, and forwarder. Fluent Bit is most commonly used for logging, but it is also capable of handling traces and metrics, making it an ideal single-agent choice for any type of telemetry data.

In this post, we’ll guide you through the process of sending distributed traces to AWS X-Ray using Fluent Bit.

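Prerequisites

  1. Docker and Docker Compose installed on your machine.
  2. An AWS account with an access key ID and secret access key that can write traces to AWS X-Ray.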

Distributed tracing workflow

Diagram showing data flow: Instrumented Application sends trace data to Centralized Observability Data Shipper, such as Fluent Bit, which then sends data to Distributed Trace Storage Engine like AWS X-Ray. Arrows denote data movement direction for effective distributed tracing.

Instrumented applications emit trace data that is collected and processed by a centralized agent, which then sends the data to a backend for storage and analysis

Generating trace data

In a microservices architecture, applications are instrumented using specific libraries to send trace data in a particular format supported by the storage engine.

OpenTelemetry (OTel) has become the standard format for working with telemetry data. Its open-source observability framework provides a standardized way to collect and transmit telemetry data such as traces, logs, and metrics from applications. OTel provides a common set of APIs, libraries, and tools for collecting and analyzing telemetry data in distributed systems.

We will be using a Python application (built with the Flask framework) that we’ve instrumented using the OpenTelemetry SDKs to generate trace data in the OpenTelemetry Protocol (OTLP) format.

We will configure Fluent Bit to receive the emitted trace data using the OpenTelemetry input plugin.

Note: For simplicity and demonstration purposes, we will be using a single service capable of generating a hierarchical distributed trace. But in a practical scenario, there would be multiple services instrumented to generate trace data.
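For a rough sense of what a hierarchical trace looks like on the wire, the sketch below builds a two-span payload in OTLP's JSON encoding using only the Python standard library. It is not the sample app's code: the real app relies on the OpenTelemetry SDKs, which build and export spans for you, and the scope name "demo-instrumentation" and both helper functions are our own assumptions. The sketch only prints the payload; it does not claim anything about which OTLP encodings a given Fluent Bit version accepts.

```python
import json
import secrets
import time


def otlp_span(trace_id, name, parent_span_id="", duration_ns=5_000_000):
    """Return one span as a dict in OTLP's JSON encoding."""
    end = time.time_ns()
    return {
        "traceId": trace_id,                  # 32 hex characters (16 bytes)
        "spanId": secrets.token_hex(8),       # 16 hex characters (8 bytes)
        "parentSpanId": parent_span_id,       # empty string for the root span
        "name": name,
        "kind": 1,                            # 1 = SPAN_KIND_INTERNAL
        "startTimeUnixNano": str(end - duration_ns),  # 64-bit values travel as strings
        "endTimeUnixNano": str(end),
    }


def otlp_payload(service_name, spans):
    """Wrap spans in the resourceSpans / scopeSpans envelope OTLP expects."""
    return {
        "resourceSpans": [{
            "resource": {"attributes": [
                {"key": "service.name", "value": {"stringValue": service_name}},
            ]},
            # The scope name below is a made-up example value.
            "scopeSpans": [{"scope": {"name": "demo-instrumentation"}, "spans": spans}],
        }]
    }


trace_id = secrets.token_hex(16)
parent = otlp_span(trace_id, "grand-parent-segment")
child = otlp_span(trace_id, "parent-segment", parent_span_id=parent["spanId"])
payload = otlp_payload("demo-app", [parent, child])

print(json.dumps(payload, indent=2))
```

The child span carries its parent's span ID, and both share one trace ID. That pair of links is what lets a backend reassemble the hierarchy.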

Storing trace data in AWS X-Ray

AWS X-Ray accepts trace requests in the form of segment documents, which can be sent using two primary protocols:

  1. AWS X-Ray API (HTTP): You can send segment documents directly to the AWS X-Ray API using the PutTraceSegments API. This is done using HTTP/1.1.
  2. Direct UDP: You can send segment documents to the AWS X-Ray daemon (which runs alongside the application) over UDP. The X-Ray daemon buffers segments in a queue and uploads them to X-Ray in batches.
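To illustrate the UDP option, here is a minimal sketch that builds the datagram the X-Ray daemon expects: a one-line JSON header, a newline, and then the segment document. The helper names are ours, and the defaults of 127.0.0.1 and port 2000 assume a daemon running locally with its default settings. This tutorial does not use this path, so the example only builds and prints the datagram.

```python
import json
import secrets
import socket
import time

# Header line that precedes every segment document sent to the daemon.
DAEMON_HEADER = '{"format": "json", "version": 1}'


def new_xray_trace_id(now=None):
    """Build an X-Ray trace ID: version 1, epoch seconds in hex, 96 random bits."""
    epoch = int(now if now is not None else time.time())
    return f"1-{epoch:08x}-{secrets.token_hex(12)}"


def build_datagram(name, start, end, trace_id=None):
    """Serialize a minimal completed segment, prefixed with the daemon header."""
    segment = {
        "name": name,
        "id": secrets.token_hex(8),            # 16 hex characters
        "trace_id": trace_id or new_xray_trace_id(start),
        "start_time": start,                   # epoch seconds as floats
        "end_time": end,
    }
    return (DAEMON_HEADER + "\n" + json.dumps(segment)).encode("utf-8")


def send_to_daemon(datagram, host="127.0.0.1", port=2000):
    """Fire-and-forget send; assumes a daemon is listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram, (host, port))


start = time.time()
datagram = build_datagram("demo-app", start, start + 0.029)
print(datagram.decode("utf-8"))
```

Calling send_to_daemon(datagram) would hand the segment to a local daemon, which then batches uploads as described above.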

Unfortunately, AWS X-Ray utilizes a non-standards-compliant trace ID. Since Fluent Bit does not support the custom X-Ray API format, it cannot send trace data directly to AWS X-Ray. To overcome this, we will be using the AWS Distro for OpenTelemetry (ADOT), which supports OTLP input and can be used with the Fluent Bit OpenTelemetry output plugin. ADOT automatically converts the compliant trace ID to the format required by AWS X-Ray.
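The difference between the two ID formats is easy to see in code. A standard trace ID is 32 hexadecimal characters, while an X-Ray trace ID has three dash-separated parts: a version number (1), eight hex digits holding the epoch time in seconds, and a 24-hex-digit unique part. The sketch below shows one plausible mapping, which treats the first eight hex characters as the timestamp. The function names are hypothetical, and this is our illustration of the format rather than ADOT's actual source code.

```python
import re


def to_xray_trace_id(otel_trace_id: str) -> str:
    """Reshape a 32-hex-character trace ID into X-Ray's 1-<8 hex>-<24 hex> form."""
    trace_id = otel_trace_id.lower()
    if not re.fullmatch(r"[0-9a-f]{32}", trace_id):
        raise ValueError("expected 32 hexadecimal characters")
    return f"1-{trace_id[:8]}-{trace_id[8:]}"


def embedded_epoch(xray_trace_id: str) -> int:
    """Read the epoch-seconds timestamp stored in the middle part of the ID."""
    return int(xray_trace_id.split("-")[1], 16)


example = to_xray_trace_id("5759e988bd862e3fe1be46a994272793")
print(example)
print(embedded_epoch(example))
```

Because X-Ray reads those eight digits as a timestamp, a trace ID whose leading bytes are purely random may not be interpreted as intended, which is why the conversion step matters.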

Our architecture looks like this:

Diagram explaining how to send trace telemetry data to AWS X-Ray using Fluent Bit, illustrating input and output processes via the OpenTelemetry plugin for efficient distributed tracing, leading to AWS Distro for OpenTelemetry and AWS X-Ray.

Fluent Bit both receives and submits OTLP, but the data must be converted to the bespoke format required by AWS X-Ray.

Configuring Fluent Bit

Here’s the Fluent Bit configuration that enables the architecture depicted above:

[SERVICE]
    flush 1
    log_level info

[INPUT]
    name opentelemetry
    host 0.0.0.0
    port 3000
    successful_response_code 200
    
[OUTPUT]
    Name                opentelemetry
    Match               *
    Host                aws-adot
    Port                4318
    traces_uri          /v1/traces
    tls                 off
    tls.verify          off
    add_label           app fluent-bit

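Breaking down the configuration above, we define one input section and one output section:

INPUT section

The opentelemetry input plugin listens on all interfaces (host 0.0.0.0) at port 3000 for incoming OTLP data. The successful_response_code setting makes Fluent Bit answer successful requests with HTTP status 200.

OUTPUT section

The opentelemetry output plugin matches every record (Match *) and forwards it to the aws-adot host on port 4318, sending traces to the /v1/traces URI. TLS is turned off for this local demo, and add_label attaches the label app with the value fluent-bit to any metrics sent through this output.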

With our INPUT and OUTPUT configuration explained, let’s implement it in practice.

Create Fluent Bit configuration file

Create a file called fluent-bit.conf with the following contents:

[SERVICE]
    flush 1
    log_level info

[INPUT]
    name opentelemetry
    host 0.0.0.0
    port 3000
    successful_response_code 200
    
[OUTPUT]
    Name                opentelemetry
    Match               *
    Host                aws-adot
    Port                4318
    traces_uri          /v1/traces
    tls                 off
    tls.verify          off
    add_label           app fluent-bit

Create OTel configuration

Create a file called otel.yaml with the following contents. Be sure to replace the placeholder <put-your-aws-region> with your AWS region.

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
exporters:
  awsxray:
    region: <put-your-aws-region>
service:
  pipelines:
    traces:
      receivers:
        - otlp
      exporters:
        - awsxray

This configuration defines how the AWS Distro for OpenTelemetry (ADOT) Collector operates:

  1. receivers: accepts telemetry data via the OpenTelemetry Protocol (OTLP) over gRPC on port 4317 and over HTTP on port 4318.
  2. exporters: sends trace data to AWS X-Ray in the configured region for analysis and visualization.
  3. service: connects the two in a traces pipeline that takes data from the otlp receiver and hands it to the awsxray exporter.
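The key idea in the collector configuration is that a pipeline may only reference components that are declared elsewhere in the file. As a quick illustration, the sketch below mirrors otel.yaml as a Python dictionary and checks that wiring. The undefined_components helper is hypothetical and is not part of ADOT; the collector performs its own validation at startup.

```python
# The otel.yaml contents from this post, mirrored as a plain dictionary.
config = {
    "receivers": {
        "otlp": {"protocols": {
            "grpc": {"endpoint": "0.0.0.0:4317"},
            "http": {"endpoint": "0.0.0.0:4318"},
        }},
    },
    "exporters": {"awsxray": {"region": "<put-your-aws-region>"}},
    "service": {"pipelines": {
        "traces": {"receivers": ["otlp"], "exporters": ["awsxray"]},
    }},
}


def undefined_components(cfg):
    """List pipeline references that point at undeclared receivers or exporters."""
    problems = []
    for pipeline, spec in cfg["service"]["pipelines"].items():
        for kind in ("receivers", "exporters"):
            for name in spec.get(kind, []):
                if name not in cfg.get(kind, {}):
                    problems.append(f"{pipeline}: {kind[:-1]} '{name}' is not defined")
    return problems


print(undefined_components(config))
```

An empty list means every name used in the traces pipeline is declared under receivers or exporters.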

Create Docker Compose configuration

Create a file called docker-compose.yml with the following contents, replacing the two placeholders <put-your-aws-access-keys-id> and <put-your-aws-secret-access-key> with your AWS credentials. The file also sets AWS_REGION to ap-south-1, so change that value if you use a different region.

version: '3.8'
services:
  aws-adot:
    image: public.ecr.aws/aws-observability/aws-otel-collector:latest
    container_name: aws-adot
    ports:
      - "4317:4317" # Grpc port
      - "4318:4318" # Http port
      - "55679:55679"
    volumes:
      - "./otel.yaml:/otel.yaml"
    environment:
      - AWS_REGION=ap-south-1
      - AWS_ACCESS_KEY_ID=<put-your-aws-access-keys-id>
      - AWS_SECRET_ACCESS_KEY=<put-your-aws-secret-access-key>
    command: ["--config", "/otel.yaml"]
    restart: "no"

  fluent-bit:
    image: cr.fluentbit.io/fluent/fluent-bit:2.2
    container_name: fluent-bit
    ports:
      - "3000:3000"
    volumes:
      - "./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf"
    restart: "no"

  trace-generator:
    image: sharadregoti/trace-generator:v0.1.0
    container_name: trace-generator
    ports:
      - "5000:5000"
    environment:
      - OTEL_HOST_ADDR=fluent-bit:3000
    restart: "no"

This docker-compose.yml file defines a multi-container setup with three services: aws-adot, fluent-bit, and trace-generator.

Start Docker containers

docker-compose up

Generate traces by hitting the sample app

Open a new terminal and run either of the following curl requests to generate a trace:

curl -X GET http://localhost:5000/generate-hierarchical
or
curl -X GET http://localhost:5000/generate
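If you would rather generate several traces from a script than repeat the curl command, a standard-library sketch like the following does the same thing. The function names and the default count are our own choices; the base URL and the two paths come from the curl commands above.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def endpoint_url(base_url="http://localhost:5000", hierarchical=True):
    """Pick one of the two endpoints exposed by the sample app."""
    path = "/generate-hierarchical" if hierarchical else "/generate"
    return urljoin(base_url, path)


def generate_traces(count=5, hierarchical=True, timeout=5):
    """Send `count` GET requests and return the HTTP status of each response."""
    statuses = []
    for _ in range(count):
        url = endpoint_url(hierarchical=hierarchical)
        with urlopen(url, timeout=timeout) as response:
            statuses.append(response.status)
    return statuses


print(endpoint_url())
print(endpoint_url(hierarchical=False))
```

With the containers running, calling generate_traces(count=10) sends ten requests and returns the status codes so you can confirm each one succeeded.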

Go to the AWS console and open AWS X-Ray

You should see a newly generated trace, as shown in the image below.

Screenshot of a CloudWatch Trace view showing a newly generated trace with response time distribution data and traces list. The trace query, integrated with AWS X-Ray for distributed tracing, highlights one trace with a 200 response code and minimal duration.

Click on the newly created trace to view the detailed information about the request.

Screenshot of CloudWatch Trace details showing visualized segments timeline for demo-app, parent-segment, and grand-parent-segment with start times and durations. Utilizing AWS X-Ray for distributed tracing, logs for this trace are available.

Clean up

Execute the following to shut everything down:

# Press ctrl + c in the terminal instance where containers are running in foreground
docker-compose down

Conclusion

In this post, we’ve walked through the essentials of setting up distributed tracing with AWS X-Ray and Fluent Bit, demonstrating how to seamlessly integrate trace data collection and forwarding in a microservices environment. By leveraging Docker, AWS X-Ray, and Fluent Bit, developers can achieve a robust observability framework that is both scalable and easy to implement.

Learn more

To learn more about Fluent Bit, visit the project website or Fluent Bit Academy, where you will find hours of on-demand training videos covering best practices and how-to’s on advanced processing, routing, and all things Fluent Bit.

We also invite you to join the vibrant Fluent community. Visit the project’s GitHub repository to learn how to become a contributor. Or join the Fluent Slack where you will find thousands of fellow Fluent Bit and Fluentd users helping one another with issues and discussing the projects’ roadmaps.
