Processing log data with Fluent Bit and WebAssembly


Learn how WASM can be used to extend Fluent Bit’s processing capabilities, enabling users to implement custom logic and functionalities.

Sharad Regoti | Guest Author

Sharad Regoti is a CKA & CKS certified software engineer based in Mumbai.


Fluent Bit has more than 20 filters that can be used for out-of-the-box data transformations. These built-in filters accommodate a wide range of use cases and make it easy for developers to transform data before routing it to backend destinations.

In addition to the built-in filters, Fluent Bit offers developers the ability to write a new C plugin or leverage Lua processing for any custom use cases. While both options cover custom use cases, each can add the friction of learning a new language. Enter the new Fluent Bit plugin for WebAssembly (WASM).

WebAssembly has received a lot of interest over the past few years based on its promise to enable faster code, provide better compatibility across platforms, and increase security. WASM does this by offering the flexibility to write code in the language of your choice (Go, Rust, C++, and others) with the extreme speed and versatility of assembly programming. Fluent Bit’s WebAssembly plugin lets you combine that flexibility with Fluent Bit’s high-speed, scalable data processing architecture.

In this post we’ll cover how WASM can be used to extend Fluent Bit’s processing capabilities, enabling users to implement custom logic and functionalities. Through the application of WASM, Fluent Bit addresses a wide array of unique and sophisticated use cases.

Enriching log data in-stream, reducing mean time to detect (MTTD)

In enterprise environments, logs often contain critical information that requires immediate action. However, in the new cloud-first and cloud-native world we may have hundreds of applications producing data across a distributed environment, making it difficult to identify where a problem is coming from.

This is where enriching logs can help. A common use case for enrichment is Kubernetes logs. Kubernetes logs can exist across any number of nodes in a cluster and trying to pinpoint an issue can become incredibly difficult if we are running 15 instances of that application in a cluster.

With Fluent Bit, we can enrich log data as it is collected with information such as geography and location. By providing the namespace, pod, and container ID we can better troubleshoot and locate issues.

Fluent Bit also provides filters for enrichment, such as the Expect, Grep, Record Modifier, and Modify filters, which allow adding or removing fields or setting a field to a static value. However, these filters only support static values.

There is an increasing number of use cases where you might want to do more than modifying static values, such as the ability to call third party data sources or to add context based on dynamic or new data. This is where WASM comes in. WASM supports retrieving values from APIs and performing calculations on the fly. With WASM, you can write programming statements directly in your processing pipeline.

The Fluent Bit WASM plugin can create a filter that incorporates logical and arithmetic operations, spanning several expressions — providing the capabilities we utilize in conventional programming.

Diagram of process custom filtering rules using WebAssembly: Fluent Bit input via Dummy plugin, filtered through custom WASM plugin, output displayed using STDOUT plugin.

With this use case in mind, let’s jump in by writing a WASM program for Fluent Bit.

Prerequisites

  • Docker: For running Fluent Bit
  • Golang (1.17 / 1.18): WASM plugins will be written using Golang
  • Tinygo (v0.24.0 or later): For building WASM programs
  • Familiarity with Fluent Bit concepts: such as inputs, outputs, parsers, and filters

Writing the WASM program

In this use case, we want to enrich the log data using location data already present in the logs. Specifically, the logs generated by the application already include an ipAddr field. Based on this field, we want to compute and add a region field (e.g., North America or Asia).

With Fluent Bit there are no additional requirements for executing WASM plugins. We only need to write a program in a language that can compile to WASM. In our case, we are going to use Go.

package main

import (
    "fmt"
    "net"
    "strings"
    "unsafe"

    "github.com/valyala/fastjson"
)

//export go_filter
func go_filter(tag *uint8, tag_len uint, time_sec uint, time_nsec uint, record *uint8, record_len uint) *uint8 {

    brecord := unsafe.Slice(record, record_len)

    br := string(brecord)

    var p fastjson.Parser
    value, err := p.Parse(br)
    if err != nil {
        fmt.Println(err)
        return nil
    }
    obj, err := value.Object()
    if err != nil {
        fmt.Println(err)
        return nil
    }

    var ar fastjson.Arena

    var ipAddr string

    if obj.Get("ipAddr") != nil {
        ipAddr = obj.Get("ipAddr").String()
    }

    ipAddrTrimmed := strings.Trim(ipAddr, `"`)
    fr, err := getRegion(ipAddrTrimmed)
    if err != nil {
        obj.Set("region", ar.NewString("unknown"))
    } else {
        obj.Set("region", ar.NewString(fr))
    }

    s := obj.String()
    s += string(rune(0)) // Note: explicit null terminator.
    rv := []byte(s)

    return &rv[0]
}

// Continent represents a continent and its IP range
type Continent struct {
    Name  string
    Start uint32
    End   uint32
}

// Function to convert an IPv4 address to a 32-bit integer
func ipToUint32(ip net.IP) uint32 {
    ip = ip.To4()
    return uint32(ip[0])<<24 + uint32(ip[1])<<16 + uint32(ip[2])<<8 + uint32(ip[3])
}

// Function to determine the continent for a given IP address
func getRegion(ipStr string) (string, error) {
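    // List of continents. These ranges are illustrative only: the IPv4 space
    // is split into seven equal slices, not real geographic allocations.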
    continents := []Continent{
        {"Africa", 0, 613566758},                  // ~1/7th of the total range
        {"Asia", 613566759, 1227133516},           // ~1/7th of the total range
        {"Europe", 1227133517, 1840700274},        // ~1/7th of the total range
        {"North America", 1840700275, 2454267032}, // ~1/7th of the total range
        {"Australia", 2454267033, 3067833790},     // ~1/7th of the total range
        {"South America", 3067833791, 3681400548}, // ~1/7th of the total range
        {"Antarctica", 3681400549, 4294967295},    // ~1/7th of the total range
    }

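    // To4 returns nil for invalid input and for IPv6 addresses, which would
    // otherwise cause an out-of-range panic in ipToUint32.
    ip := net.ParseIP(ipStr).To4()
    if ip == nil {
        return "", fmt.Errorf("invalid IPv4 address")
    }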

    ipInt := ipToUint32(ip)

    for _, continent := range continents {
        if ipInt >= continent.Start && ipInt <= continent.End {
            return continent.Name, nil
        }
    }
    return "", fmt.Errorf("IP address not in any continent range")
}

func main() {}

Program Explanation

  1. The core logic is written in the function go_filter. This function name will also be used during WASM plugin configuration.
  2. It is mandatory for the WASM plugin to have the below function signature.
//export go_filter
func go_filter(tag *uint8, tag_len uint, time_sec uint, time_nsec uint, record *uint8, record_len uint) *uint8

Note: The //export go_filter comment above the function is required, and the exported name must match the function name.

  3. Using the function parameters we have access to the original log record, tag, and timestamp. Here is an example log record:
{
    "log": "2023-10-02T06:52:52.843524746Z stdout F 122.30.117.241 - - [02/Oct/2023:06:52:23 +0000] GET /vortals HTTP/1.0 204 12615",
    "ipAddr": "3.7.65.195"
}
  4. Processing the Record:
    • The record parameter is a pointer to the first byte of the JSON-encoded record. unsafe.Slice turns it into a byte slice of record_len bytes, which is then converted to a Go string.
    • This string is then parsed using the fastjson package.
  5. Determining region:
    • The getRegion function takes an IP address as a string, parses it, converts it to an integer, and determines which region the IP address belongs to based on predefined IP ranges. If the IP address is invalid or doesn’t fall within any range, an error is returned and go_filter sets the region to "unknown".
  6. Modify and Return:
    • The determined region is added to the original JSON. With the illustrative ranges used in the program, the modified record will look like this:
{
    "log": "2023-10-02T06:52:52.843524746Z stdout F 122.30.117.241 - - [02/Oct/2023:06:52:23 +0000] GET /vortals HTTP/1.0 204 12615",
    "ipAddr": "3.7.65.195",
    // 👇 New field added by WASM plugin
    "region": "Africa"
}
    • The function then converts the modified JSON string back to a byte slice and returns a pointer to its first byte.
    • Note that an explicit null terminator is appended to the string before it is converted back to a byte slice. Only a pointer is returned, with no length, so the host (Fluent Bit is written in C) needs the terminator to find the end of the record.
  7. The main function is empty because the primary function here (go_filter) is meant to be exported and used as a plugin.
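
To see how the range lookup behaves without building a WASM module, the region logic can be exercised as an ordinary Go program. The following is a minimal sketch, not part of the plugin: the names ipRange and regionFor are hypothetical, the boundaries are copied from the program above, and encoding/binary is swapped in for the manual shift-and-add conversion (both produce the same big-endian integer).

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net"
)

// ipRange is a hypothetical helper type: one slice of the IPv4 space.
type ipRange struct {
	name       string
	start, end uint32
}

// Same illustrative boundaries as the filter program; they are not real
// geographic allocations.
var ranges = []ipRange{
	{"Africa", 0, 613566758},
	{"Asia", 613566759, 1227133516},
	{"Europe", 1227133517, 1840700274},
	{"North America", 1840700275, 2454267032},
	{"Australia", 2454267033, 3067833790},
	{"South America", 3067833791, 3681400548},
	{"Antarctica", 3681400549, 4294967295},
}

// regionFor mirrors getRegion but returns "unknown" instead of an error.
func regionFor(ipStr string) string {
	ip := net.ParseIP(ipStr).To4() // nil for invalid input and for IPv6
	if ip == nil {
		return "unknown"
	}
	n := binary.BigEndian.Uint32(ip) // same value as the shift-and-add version
	for _, r := range ranges {
		if n >= r.start && n <= r.end {
			return r.name
		}
	}
	return "unknown"
}

func main() {
	// The five addresses used by the dummy inputs, plus an IPv6 address.
	ips := []string{"41.0.0.1", "114.0.0.1", "185.0.0.1", "3.7.65.196", "196.0.0.1", "::1"}
	for _, ip := range ips {
		fmt.Printf("%-12s -> %s\n", ip, regionFor(ip))
	}
}
```

Running this with go run before compiling with tinygo is a quick way to check which region each test address should receive.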

For more info on writing WASM plugins, follow the official documentation.

Instructions for Compiling the WASM Program:

  1. Initialize a new Go project using the below command
mkdir go-filter && cd go-filter && go mod init go-filter
  2. Copy the above Go program into a file called filter.go, then run go mod tidy so that the fastjson dependency is downloaded and recorded in go.mod
  3. With our filter program written, let’s compile it using tinygo
# Use the below command for tinygo version >= 0.33.0
tinygo build -target=wasi -o filter.wasm filter.go

# Use the below command for tinygo version < 0.33.0
tinygo build -wasm-abi=generic -target=wasi -o filter.wasm filter.go
  4. It will produce a file called filter.wasm. This compiled WASM file will be used by Fluent Bit to execute the plugin.
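
Before wiring the file into Fluent Bit, it can be worth confirming that the build actually produced a WebAssembly binary. The sketch below is an optional sanity check, not part of the plugin workflow: it only inspects the first eight bytes, which for a WebAssembly module are the magic bytes \0asm followed by binary format version 1. The helper name looksLikeWasm is hypothetical, and the default path filter.wasm assumes you run it from the project directory.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
)

// wasmHeader is the 4-byte magic "\0asm" followed by the
// little-endian binary format version 1.
var wasmHeader = []byte{0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00}

// looksLikeWasm reports whether b starts with a WebAssembly module header.
// It does not validate the rest of the module.
func looksLikeWasm(b []byte) bool {
	return bytes.HasPrefix(b, wasmHeader)
}

func main() {
	path := "filter.wasm" // default; pass another path as the first argument
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	data, err := os.ReadFile(path)
	if err != nil {
		fmt.Println("cannot read file:", err)
		return
	}
	fmt.Println("valid header:", looksLikeWasm(data))
}
```

A passing check only means the header is right; whether go_filter is exported correctly is confirmed when Fluent Bit loads the module.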

Configuring Fluent Bit to use WASM plugin

Here’s the Fluent Bit configuration for our use case:

[INPUT]
    Name             dummy
    Dummy            {"ipAddr":"41.0.0.1","log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 GET / - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 304 142 756 1078 -"}
    Tag              dummy

[INPUT]
    Name             dummy
    Dummy            {"ipAddr":"114.0.0.1","log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 GET / - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 304 142 756 1078 -"}
    Tag              dummy

[INPUT]
    Name             dummy
    Dummy            {"ipAddr":"185.0.0.1","log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 GET / - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 304 142 756 1078 -"}
    Tag              dummy

[INPUT]
    Name             dummy
    Dummy            {"ipAddr":"3.7.65.196","log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 GET / - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 304 142 756 1078 -"}
    Tag              dummy

[INPUT]
    Name             dummy
    Dummy            {"ipAddr":"196.0.0.1","log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 GET / - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 304 142 756 1078 -"}
    Tag              dummy

[FILTER]
    Name             wasm
    match            dummy
    WASM_Path        /fluent-bit/etc/filter.wasm
    Function_Name    go_filter

[OUTPUT]
    name             stdout
    match            dummy

Breaking down the configuration above:

Input Section

Each [INPUT] section specifies the configuration for an input plugin. In this configuration, there are five dummy input plugins, each generating log data with different IP addresses.

[INPUT]
    Name             dummy
    Dummy            {"ipAddr":"41.0.0.1","log": "2023-08-11 19:56:44 W3SVC1 WIN-PC1 ::1 GET / - 80 ::1 Mozilla/5.0+(Windows+NT+10.0;+Win64;+x64)+AppleWebKit/537.36+(KHTML,+like+Gecko)+Chrome/115.0.0.0+Safari/537.36+Edg/115.0.1901.200 - - localhost 304 142 756 1078 -"}
    Tag              dummy
  • Name: Specifies the input plugin name (dummy).
  • Dummy: Provides the dummy data to be generated by the plugin.
  • Tag: Assigns a tag (dummy) to this input, which is used to match this input to filters and outputs.

For more information about the dummy plugin, check the official documentation.

Filter Section

The [FILTER] section specifies the configuration for a filter plugin. In this configuration, there is a single filter using the WASM plugin. This plugin selects all the logs that match the tag dummy and applies custom processing rules as discussed above.

[FILTER]
    Name             wasm
    match            dummy
    WASM_Path        /fluent-bit/etc/filter.wasm
    Function_Name    go_filter
  • Name: Specifies the filter plugin name (wasm).
  • match: Matches the tag (dummy) to apply this filter to the input data tagged with dummy.
  • WASM_Path: Path to the WebAssembly (Wasm) file that contains the filter logic.
  • Function_Name: The function within the WASM module to be executed (go_filter).

For more information about the WASM plugin, check the official documentation.

Output Section

The [OUTPUT] section specifies the configuration for an output plugin. In this configuration, there is a single output plugin which redirects logs to Fluent Bit’s stdout.

Note: This is done for demonstration purposes only—in a practical scenario we would have sent it to S3, Elasticsearch, or some other destination.

[OUTPUT]
    name             stdout
    match            dummy
  • name: Specifies the output plugin name (stdout).
  • match: Matches the tag (dummy) to send the filtered input data tagged with dummy to stdout.

For more information about the stdout plugin, check the official documentation.

Instructions For Configuring Fluent Bit:

  1. Create the configuration file: create a file called fluent-bit.conf containing the complete configuration shown above (the five [INPUT] sections, the [FILTER] section, and the [OUTPUT] section).
  2. Override Default Configuration

Execute the below command to run Fluent Bit in a Docker container. Ensure that both filter.wasm and fluent-bit.conf exist in the directory where you run the command, since both are mounted from it.

docker run \
  -v $(pwd)/filter.wasm:/fluent-bit/etc/filter.wasm \
  -v $(pwd)/fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf \
  -ti cr.fluentbit.io/fluent/fluent-bit:3.1 \
  /fluent-bit/bin/fluent-bit \
  -c /fluent-bit/etc/fluent-bit.conf
  3. Verify Fluent Bit Logs

The Fluent Bit container will emit logs. You should be able to see the added region field, as shown in the below image.

Screen displaying Fluent Bit log data with timestamps, log levels, process details, and added "region" fields via WebAssembly plugin.

To programmatically verify plugin results, you can use the Expect filter. Read more at Validating Your Data and Structure.
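
The Expect filter checks records inside the pipeline. As a complement outside it, a few lines of Go can check captured records for the new field. This is a minimal sketch under assumptions: checkRecord is a hypothetical helper, it uses the standard library's encoding/json rather than fastjson, and it assumes you have each enriched record available as a single JSON object (how you capture them is up to you).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// checkRecord returns an error unless raw is a JSON object that contains
// a non-empty string field named "region".
func checkRecord(raw []byte) error {
	var rec map[string]interface{}
	if err := json.Unmarshal(raw, &rec); err != nil {
		return fmt.Errorf("not a JSON object: %w", err)
	}
	region, ok := rec["region"].(string)
	if !ok || region == "" {
		return fmt.Errorf("missing or empty region field")
	}
	return nil
}

func main() {
	samples := []string{
		`{"ipAddr":"41.0.0.1","region":"Asia"}`, // enriched record
		`{"ipAddr":"41.0.0.1"}`,                 // record the filter never touched
	}
	for _, s := range samples {
		if err := checkRecord([]byte(s)); err != nil {
			fmt.Println("FAIL:", err)
			continue
		}
		fmt.Println("ok")
	}
}
```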

Conclusion

In this post, we examined how to use Fluent Bit and WASM to enrich log data based on pre-existing location data in the logs.

The WASM plugin is just one option for processing data with Fluent Bit. If you are interested in exploring more about Fluent Bit’s ability to process and transform streaming data we recommend the following:

  • “Fluent Bit: Advanced Processing”: This on-demand webinar provides an introduction to processing with Fluent Bit and demonstrates best practices and real-world examples for redaction, reduction, enrichment, and tagging of log data.
  • “Creating custom processing rules for Fluent Bit with Lua”: In addition to support for WASM, Fluent Bit also supports custom scripts written in Lua. This step-by-step tutorial walks you through several examples.

To learn more, check out The Fluent Bit Academy, your destination for best practices and how-tos on all things Fluent Bit.

Whether you are a seasoned professional or just getting started in open source, there is something for you to learn.

WebAssembly (WASM) FAQs

What is Fluent Bit?

Fluent Bit is a super fast, lightweight, and scalable telemetry data agent and processor for logs, metrics, and traces. It is the industry standard for Kubernetes and major cloud providers, including Google, Amazon, Oracle, IBM, and Microsoft.

What is WebAssembly (WASM)?

WebAssembly enables faster code, provides better compatibility across platforms, and increases security. WASM does this by offering the flexibility to write code in the language of your choice (Go, Rust, C++, and others) with the extreme speed and versatility of assembly programming.

What is the Fluent Bit WASM plugin?

The Fluent Bit WASM plugin can create a filter that incorporates logical and arithmetic operations, spanning several expressions — providing the capabilities we utilize in conventional programming.

How does Fluent Bit enrich log data?

Fluent Bit enriches log data as it is collected with information such as geography and location. Fluent Bit improves the ability to troubleshoot and locate issues by providing the namespace, pod, and container ID.
