Creating custom processing rules for Fluent Bit with Lua

What if you need a Fluent Bit filter plugin that doesn’t exist yet? Thankfully, for these situations, the official Lua filter plugin for Fluent Bit allows users to write custom Lua scripts to process streaming data. Learn how in this post.

Erik Bledsoe | Customer Journey Strategist | Chronosphere

Erik is a Customer Journey Strategist at Chronosphere. His own personal journey has taken him from working in higher education to working in tech startups. Data relaxes him, as do cooking and books. He currently resides in North Carolina, where he recently returned after a long time away. He hopes to prove that fellow North Carolinian Thomas Wolfe was wrong about going home again.

Anurag Gupta | Field Architect | Chronosphere

Anurag is a Field Architect at Chronosphere and a maintainer of the Fluentd and Fluent Bit projects. Previously, he was the co-founder of Calyptia, a telemetry pipeline company that was acquired by Chronosphere. Anurag worked at Elastic, driving cloud products and creating the Elastic Operator product. His experience also includes time at Treasure Data, heading enterprise open source with Fluentd, and at Microsoft Azure Log Analytics, working on observability as a cloud provider.

When no existing plugin does exactly what you need

Fluent Bit enables you to collect logs, metrics, and traces from various sources, filter and transform them, and then forward them to multiple destinations. It utilizes a plugin architecture for creating integrations with data sources and destinations as well as filters for in-stream data processing.

[Figure: The Fluent Bit data pipeline, showing Input, Parser, Filter, Buffer, Routing, and multiple Output destinations.]

Although there are dozens of supported plugins, there may be times when no out-of-the-box plugin will accomplish the exact processing you need. You may need, for example, to apply some complex business logic to the data before routing and storing for analysis. Or you may need to enrich the data with some sort of computation. Thankfully, for these situations, the official Lua filter plugin for Fluent Bit allows users to write custom Lua scripts to process the records flowing through the data pipeline.

In this post, we’ll provide an overview of the Lua filter plugin and how it functions. We’ll also provide some working examples that demonstrate the plugin and, hopefully, inspire you to create your own custom Lua scripts.

What you’ll need to get started

  • Familiarity with Fluent Bit concepts such as inputs, outputs, parsers, and filters. If you’re unfamiliar with these concepts, please refer to the official documentation.
  • A running Fluent Bit instance. We will be using a very basic AWS EC2 running Debian. Check the documentation if you need help installing Fluent Bit for your OS.
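  • Lua installed on the same machine where Fluent Bit is running, which is handy for testing scripts on their own. (The Lua filter itself runs scripts with the LuaJIT engine embedded in Fluent Bit.) Lua comes preinstalled on many flavors of Linux. Check the Lua documentation for help with installation.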

What is Lua?

Lua is a lightweight, high-level, multi-paradigm scripting language designed primarily for embedded use in applications. It has a Python-like syntax, making it easy for many developers to pick up. It is widely used as an extension language, including in applications such as:

  • Roblox
  • World of Warcraft
  • Adobe Photoshop Lightroom
  • Redis

Lua’s high performance, small footprint, and built-in pattern-matching library make it well suited for scripting extensions such as Fluent Bit filters. As we will see, the built-in pattern matching lets us use Lua for powerful parsing and transformation of records. Another benefit of this approach is that Lua scripts can be much less resource-intensive than complex regular expressions.
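
As a quick illustration of the pattern-matching library, here is a minimal sketch that pulls the method, path, and status code out of a simplified request string with string.match. The function name parse_request and the sample line are hypothetical and exist only for this illustration; they are not part of Fluent Bit.

```lua
-- Hypothetical helper: extract method, path, and status code from a
-- simplified access-log fragment such as: "GET /search/tag/list" 301
function parse_request(line)
    -- %u+    one or more uppercase letters (the HTTP method)
    -- %S+    a run of non-space characters (the path)
    -- %d%d%d exactly three digits (the status code)
    local method, path, code = line:match('"(%u+) (%S+)" (%d%d%d)')
    if method == nil then
        return nil -- the line did not have the expected shape
    end
    return { method = method, path = path, code = code }
end

local parsed = parse_request('"GET /search/tag/list" 301')
print(parsed.method, parsed.path, parsed.code)
```

Note that the captures come back as strings, so a numeric comparison on the status code would need tonumber first.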

Configuring the Lua filter plugin for Fluent Bit

To invoke the Lua filter plugin, you must define it in your Fluent Bit configuration file. It requires 4 parameters:

  • name: this will always be lua
  • match: this defines which records should be processed by the filter
  • script or code: these two parameters tell Fluent Bit where to find the Lua script to be executed. The script parameter gives the path and filename of an external file containing the script. The code parameter supplies the Lua script inline in the configuration file, as the value of the parameter itself.
  • call: the name of a function defined in the script that should be executed. Only one function can be called from the filter configuration, although that function may call other functions in the script. You could also configure multiple Lua filters in the configuration file, each calling a different function or script if needed.

The plugin accepts other parameters as well, but these 4 are required.

A sample configuration could look like this:

[FILTER]
    name lua
    match *
    script /path/to/your/script/my-script.lua
    call cb_filter

In the above, the match value of * indicates that this filter should process all records. The script parameter points to the location of the file. Finally, the call parameter identifies that the function to be executed is named cb_filter.

If we wanted to utilize inline scripting rather than an external file, the configuration might look like this:

[FILTER]
    Name    lua
    Match   *
    code    function inline_filter(tag, timestamp, record) record.tag = tag; return 1, timestamp, record end
    call    inline_filter

Understanding the Lua filter plugin

The Lua function takes three arguments, which are automatically supplied by Fluent Bit every time it calls a function on a matching record. The three arguments are:

  • tag: the name of the tag associated with the incoming record
  • timestamp: the timestamp associated with the incoming record, formatted as an epoch timestamp with nanosecond resolution. If the record contains an identifiable timestamp, Fluent Bit will utilize that as the timestamp. If the record does not contain a timestamp, or if Fluent Bit cannot identify the timestamp because the record is unstructured, it will generate a timestamp based on when Fluent Bit received the record.
  • record: the record itself, formatted as a Lua table
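
To make the three arguments concrete, here is a small sketch of a filter that simply records what it was given. The function name inspect_filter, the added field names, and the sample tag are hypothetical. It also assumes the timestamp arrives as a Lua number of seconds with a fractional part, which is our understanding of the default behavior.

```lua
-- Hypothetical filter that copies details of its arguments into the record.
function inspect_filter(tag, timestamp, record)
    -- Count the keys first; adding fields while iterating with pairs()
    -- is not safe in Lua.
    local field_count = 0
    for _ in pairs(record) do
        field_count = field_count + 1
    end

    record.source_tag = tag                     -- tag is a plain string
    record.received_at = math.floor(timestamp)  -- whole seconds only
    record.field_count = field_count            -- record is a Lua table
    return 1, timestamp, record
end

-- Simulate the call Fluent Bit would make (sample values only).
local code, ts, rec = inspect_filter("apache.access", 1643771334.25,
                                     { method = "GET", code = "200" })
print(code, rec.source_tag, rec.field_count)
```

Calling the function by hand like this, with a standalone Lua interpreter, is a convenient way to check a script before wiring it into a pipeline.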

The Lua function must then return three values:

  • code: the code provides instructions for Fluent Bit about how to process the record being returned. There are four possible values:
      • -1: The record will be dropped from the pipeline; no additional filters will be applied and it will not be routed to any output. This is useful for disposing of unnecessary or noisy data, resulting in storage savings.
      • 0: The record should not be modified by the Lua filter; the original record passed to the Lua function should continue through the pipeline, including through any additional filters, and be routed to the appropriate endpoint(s) as defined in the configuration file. This is useful when not all data needs to be processed.
      • 1: The original record and its timestamp should be replaced by the timestamp and record values returned by the Lua function. This is useful when strict auditing of all transformations is required.
      • 2: The original record has been changed and should be replaced with the returned record, but the record timestamp should remain the same as originally passed.
  • timestamp: the timestamp that should be applied to the record being returned; this will only take place if the returned code value is 1
  • record: the original record passed to the function as transformed (or not) by the Lua script
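
The four return codes can be sketched side by side in a single function. This is an illustration only: the function name codes_filter, the /healthcheck path, and the added fields are hypothetical, and a real filter would rarely need all four branches.

```lua
-- Hypothetical filter that exercises each of the four return codes.
function codes_filter(tag, timestamp, record)
    if record.path == "/healthcheck" then
        -- -1: drop the record; the other two return values are not used.
        return -1, timestamp, record
    elseif record.method == "GET" then
        -- 0: pass the original record through untouched.
        return 0, timestamp, record
    elseif record.method == "DELETE" then
        -- 2: replace the record but keep the original timestamp.
        record.audit = "true"
        return 2, timestamp, record
    end
    -- 1: replace both the timestamp and the record.
    -- Here the record is restamped with the current time in whole seconds.
    record.original_ts = timestamp
    return 1, os.time(), record
end

local code = codes_filter("demo", 100, { path = "/healthcheck" })
print(code)
```

Returning 2 instead of 1 is a reasonable default when a script only touches the record, since it leaves no room for accidentally altering the timestamp.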

As you might imagine, Lua’s flexibility greatly expands the possibilities for processing your streaming data. We could, for example, compare the IP address contained in a record to a list of known IP addresses and then drop or tag the record accordingly. This would enable us to drop records generated when Googlebot indexes our website, or to tag internal traffic for routing to a different destination than external traffic. We could also identify and mask sensitive data contained within the record.
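
Here is a rough sketch of those three ideas combined: dropping traffic from a known address, tagging internal versus external traffic, and masking part of the client address. The lookup tables, the function name classify_filter, and the traffic field are all hypothetical, and the addresses are sample values. The sketch assumes the client address lives in the record's host key, as it does in the Apache examples in this post.

```lua
-- Hypothetical lookup tables; the addresses are sample values only.
local crawler_ips = { ["203.0.113.7"] = true }
local internal_prefixes = { "10.", "192.168." }

-- Returns true when the address starts with one of the internal prefixes.
local function is_internal(ip)
    for _, prefix in ipairs(internal_prefixes) do
        -- Plain prefix comparison; no pattern matching involved.
        if ip:sub(1, #prefix) == prefix then
            return true
        end
    end
    return false
end

function classify_filter(tag, timestamp, record)
    local ip = record.host
    if ip == nil then
        return 0, timestamp, record   -- nothing to inspect; pass through
    end
    if crawler_ips[ip] then
        return -1, timestamp, record  -- known crawler: drop the record
    end
    record.traffic = is_internal(ip) and "internal" or "external"
    -- Mask the last octet. The parentheses keep only the string that
    -- gsub returns and discard its second value (the match count).
    record.host = (ip:gsub("%.%d+$", ".xxx"))
    return 2, timestamp, record
end

local code, ts, rec = classify_filter("demo", 1, { host = "10.1.2.3" })
print(code, rec.traffic, rec.host)
```

A table keyed by address gives constant-time lookups, which matters when the function runs once per record.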

Now that we have an understanding of how the plugin works, let’s dive in and start creating some Lua script filters.

Example Lua filters

For the purposes of this post, we’ll be using some sample Apache2 access log data as our input. You can download the data here if you would like to use it to follow along with the examples.

Before adding the Lua filter, our Fluent Bit config looks like this:

[SERVICE]
    parsers_file /fluent-bit/etc/parsers.conf

[INPUT]
    name tail
    path /var/log/access.log
    read_from_head true
    parser apache2

[OUTPUT]
    name stdout
    format json
    match *

We load the standard Fluent Bit parsers file. We then use our sample log file as input, reading from the first line (head), and run it through the included apache2 parser, which turns each line into a structured record with a timestamp. Finally, we send the output to stdout, formatted as JSON, so that we can see it.

We can then start Fluent Bit with this command:

/fluent-bit/bin/fluent-bit -c /fluent-bit/etc/fluent-bit.conf | jq

Note that we are piping our output through jq to make it more readable.

The output should look something like this:

…
  {
    "date": 1643771334,
    "host": "182.165.233.130",
    "user": "-",
    "method": "PUT",
    "path": "/explore",
    "code": "200",
    "size": "4953",
    "referer": "https://fields.com/list/wp-content/main/faq/",
    "agent": "Mozilla/5.0 (Windows NT 5.1; fy-DE; rv:1.9.2.20) Gecko/2016-03-08 11:51:37 Firefox/3.8"
  },
  {
    "date": 1643771606,
    "host": "97.155.7.33",
    "user": "-",
    "method": "DELETE",
    "path": "/explore",
    "code": "200",
    "size": "4981",
    "referer": "http://skinner-stanley.info/list/faq.htm",
    "agent": "Mozilla/5.0 (iPad; CPU iPad OS 10_3_4 like Mac OS X) AppleWebKit/533.2 (KHTML, like Gecko) FxiOS/11.8c5049.0 Mobile/32P755 Safari/533.2"
  },
  {
    "date": 1643771902,
    "host": "73.137.57.176",
    "user": "-",
    "method": "GET",
    "path": "/search/tag/list",
    "code": "301",
    "size": "5103",
    "referer": "http://www.warner-kramer.info/",
    "agent": "Mozilla/5.0 (compatible; MSIE 6.0; Windows NT 6.2; Trident/3.0)"
  }

Example: Saying hello

Now let’s create our first Lua script. We’ll start by simply enriching the record with a new key-value pair. The Lua script looks like this:

function hi_filter(tag, timestamp, record)
    record.hello = "Hello world"
    return 1, timestamp, record
end

Since our function is short, we will include it inline in our Fluent Bit configuration, which now looks like this:

[SERVICE]
    parsers_file /fluent-bit/etc/parsers.conf

[INPUT]
    name tail
    path /var/log/access.log
    read_from_head true
    parser apache2

[FILTER]
    name lua
    match *
    code function hi_filter(tag, timestamp, record) record.hello = "Hello world"; return 1, timestamp, record end
    call hi_filter

[OUTPUT]
    name stdout
    format json
    match *

We then request that Fluent Bit reprocess our sample data:

/fluent-bit/bin/fluent-bit -c /fluent-bit/etc/fluent-bit.conf | jq

The last few records of our output should look like this:

{
  "date": 1643771606,
  "method": "DELETE",
  "code": "200",
  "path": "/explore",
  "size": "4981",
  "referer": "http://skinner-stanley.info/list/faq.htm",
  "agent": "Mozilla/5.0 (iPad; CPU iPad OS 10_3_4 like Mac OS X) AppleWebKit/533.2 (KHTML, like Gecko) FxiOS/11.8c5049.0 Mobile/32P755 Safari/533.2",
  "user": "-",
  "host": "97.155.7.33",
  "hello": "Hello world"
},
{
  "date": 1643771902,
  "method": "GET",
  "code": "301",
  "path": "/search/tag/list",
  "size": "5103",
  "referer": "http://www.warner-kramer.info/",
  "agent": "Mozilla/5.0 (compatible; MSIE 6.0; Windows NT 6.2; Trident/3.0)",
  "user": "-",
  "host": "73.137.57.176",
  "hello": "Hello world"
}

Because our function returned 1 as the code value (return 1, timestamp, record), the modified record replaced the original and continued down the pipeline. If we had returned -1, all of the records would have been dropped, while a value of 0 would have discarded our changes and passed the original records through unmodified.

Example: Enriching data with hostname

Now let’s enrich our data with something a little more useful than “hello world.”

First, replace the Lua filter in our configuration file with the following:

[FILTER]
    Name    lua
    Match   *
    script /fluent-bit/etc/script-example.lua
    call    enrich_filter

Since the Lua script we will be using is a bit more complex than our original script, we will store it in a separate file and call it using the script parameter.

Now create the file at the path above with this content:

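-- Cache the hostname so the shell command runs only once.
local cached_hostname

local function get_hostname()
    if cached_hostname == nil then
        local handle = io.popen('hostname')
        -- Read all output and strip trailing whitespace, including the newline.
        cached_hostname = handle:read('*a'):gsub('%s+$', '')
        handle:close()
    end
    return cached_hostname
end

function enrich_filter(tag, timestamp, record)
    record.hostname = get_hostname()
    return 1, timestamp, record
end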

This Lua script will grab the hostname of our machine and add it to our record, which is much more useful when examining our logs later than just adding a greeting.

It also demonstrates how even though a single Lua filter can only call one function, that function can call additional functions.
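
The script above assumes that io.popen is available and that the hostname command produces output. A more defensive variant might look like the following sketch. The names read_hostname and safe_enrich_filter, and the "unknown" placeholder value, are our own choices for illustration.

```lua
-- Cached result so the shell command runs at most once.
local cached_hostname

-- Returns the hostname, or nil if it could not be determined.
local function read_hostname()
    -- pcall guards against platforms where io.popen is not supported.
    local ok, handle = pcall(io.popen, "hostname")
    if not ok or handle == nil then
        return nil
    end
    local output = handle:read("*a") or ""
    handle:close()
    -- Parentheses keep only the first value gsub returns (the string).
    local trimmed = (output:gsub("%s+$", ""))
    if trimmed == "" then
        return nil
    end
    return trimmed
end

function safe_enrich_filter(tag, timestamp, record)
    if cached_hostname == nil then
        -- "unknown" is a placeholder chosen for this sketch.
        cached_hostname = read_hostname() or "unknown"
    end
    record.hostname = cached_hostname
    return 1, timestamp, record
end

local code, ts, rec = safe_enrich_filter("demo", 1, {})
print(code, rec.hostname)
```

Because the fallback value is cached too, a failed lookup is not retried for every record.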

When we again run Fluent Bit and process our sample log data, the last few records should look something like the following:

{
  "date": 1643771334,
  "code": "200",
  "hostname": "fluent-bit-sandbox",
  "size": "4953",
  "agent": "Mozilla/5.0 (Windows NT 5.1; fy-DE; rv:1.9.2.20) Gecko/2016-03-08 11:51:37 Firefox/3.8",
  "path": "/explore",
  "host": "182.165.233.130",
  "method": "PUT",
  "user": "-",
  "referer": "https://fields.com/list/wp-content/main/faq/"
},
{
  "date": 1643771606,
  "code": "200",
  "hostname": "fluent-bit-sandbox",
  "size": "4981",
  "agent": "Mozilla/5.0 (iPad; CPU iPad OS 10_3_4 like Mac OS X) AppleWebKit/533.2 (KHTML, like Gecko) FxiOS/11.8c5049.0 Mobile/32P755 Safari/533.2",
  "path": "/explore",
  "host": "97.155.7.33",
  "method": "DELETE",
  "user": "-",
  "referer": "http://skinner-stanley.info/list/faq.htm"
},
{
  "date": 1643771902,
  "code": "301",
  "hostname": "fluent-bit-sandbox",
  "size": "5103",
  "agent": "Mozilla/5.0 (compatible; MSIE 6.0; Windows NT 6.2; Trident/3.0)",
  "path": "/search/tag/list",
  "host": "73.137.57.176",
  "method": "GET",
  "user": "-",
  "referer": "http://www.warner-kramer.info/"
}

Example: Dropping and routing data

In this example, we will use a Lua function to examine the HTTP status codes in our logs. If the code is 200, we will drop the record. For the remaining records, we will add a new key-value pair that varies depending on the code.

Append the following code to our existing script-example.lua file:

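function route_filter(tag, timestamp, record)
    -- Drop records whose status code starts with 200 (successful requests).
    if record.code:find('^200') ~= nil then
        return -1, timestamp, record
    end
    -- Tag the remaining records by status code for later routing.
    if record.code == "404" then
        record.route = "team1"
    elseif record.code == "301" then
        record.route = "team2"
    else
        record.route = "team3"
    end
    return 1, timestamp, record
end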

Next, modify our Fluent Bit configuration file to add a second Lua filter that calls our new function:

[SERVICE]
    parsers_file /fluent-bit/etc/parsers.conf

[INPUT]
    name tail
    path /var/log/access.log
    read_from_head true
    parser apache2

[FILTER]
    Name    lua
    Match   *
    script /fluent-bit/etc/script-example.lua
    call    route_filter

[FILTER]
    Name    lua
    Match   *
    script /fluent-bit/etc/script-example.lua
    call    enrich_filter

[OUTPUT]
    name stdout
    format json
    match *

Although each instance of the Lua filter can only call one function, there is no problem with having both filters refer to the same file that contains our scripts. Note that rather than add a second Lua filter to our configuration we could also have modified our script so that the route_filter function called the enrich_filter function as the last step before returning its values.
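
The single-filter alternative could be sketched as follows. To keep the example self-contained, enrich_filter is replaced here by a stand-in that sets a fixed hostname; the name route_and_enrich_filter is hypothetical.

```lua
-- Stand-in for the enrich_filter shown earlier. A fixed hostname keeps
-- this sketch self-contained; the real function shells out to `hostname`.
function enrich_filter(tag, timestamp, record)
    record.hostname = "fluent-bit-sandbox"
    return 1, timestamp, record
end

-- Hypothetical combined function: route first, then enrich.
function route_and_enrich_filter(tag, timestamp, record)
    if record.code == "200" then
        return -1, timestamp, record  -- dropped records skip enrichment
    elseif record.code == "404" then
        record.route = "team1"
    elseif record.code == "301" then
        record.route = "team2"
    else
        record.route = "team3"
    end
    -- Hand off to enrich_filter as the last step and return its values.
    return enrich_filter(tag, timestamp, record)
end

local code, ts, rec = route_and_enrich_filter("demo", 1, { code = "404" })
print(code, rec.route, rec.hostname)
```

With this approach the configuration needs only one Lua filter, with call set to route_and_enrich_filter. The trade-off is that the two steps can no longer be enabled or matched independently.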

When we rerun Fluent Bit to process our sample data, we see that all records with code 200 have been dropped, and each remaining record has a new route key with a value of either team1, team2, or team3.

With this logic applied, we can then use a series of rewrite_tag filters to route the data to different destinations. We won’t go through the specifics of how that works in this post, but watch this webinar for an excellent demonstration of the concept.

 

Next steps: Learn more about Fluent Bit

To learn more about Fluent Bit, check out Fluent Bit Academy, your destination for best practices and how-to’s on advanced processing, routing, and all things Fluent Bit. Here’s a sample of what you can find there:

  • Getting Started with Fluent Bit and OpenSearch
  • Getting Started with Fluent Bit and OpenTelemetry
  • Fluent Bit for Windows

 

About Fluent Bit and Chronosphere

With Chronosphere’s acquisition of Calyptia in 2024, Chronosphere became the primary corporate sponsor of Fluent Bit. Eduardo Silva — the original creator of Fluent Bit and co-founder of Calyptia — leads a team of Chronosphere engineers dedicated full-time to the project, ensuring its continuous development and improvement.

Fluent Bit is a graduated project of the Cloud Native Computing Foundation (CNCF) under the umbrella of Fluentd, alongside other foundational technologies such as Kubernetes and Prometheus. Chronosphere is also a silver-level sponsor of the CNCF.
