Learn how to use open standards and open source tools to collect telemetry data from any source and route it to any destination.
Sudhanshu Prajapati is a developer advocate and software developer with an interest in observability.
On: Apr 5, 2023
An e-commerce company planned a flash sale, and the sudden spike in traffic led to website performance issues. The engineering team couldn’t pinpoint the actual root cause and realized that they needed a better way to monitor their applications. After some research, the team decided to implement an observability pipeline.
An observability pipeline, also called a telemetry pipeline, is a system that collects, processes, and analyzes data from various sources, including logs, metrics, and traces, to provide insights into a distributed system’s performance and behavior.
With an observability pipeline in place, the company could monitor its applications in real time, detect anomalies, and troubleshoot issues faster. Further, by integrating the pipeline with its incident management system, the company would improve its systems’ reliability and availability.
While it looked good on paper, the team still had multiple questions: Which tools should we use? How do we avoid vendor lock-in? How do we handle logs, metrics, and traces in a single pipeline? These are common questions that come up while implementing an observability pipeline.
Multiple observability tools and stacks are available. Like everything else, there’s no one-size-fits-all solution. In this post, we’ll discuss implementing an agile observability pipeline using OpenTelemetry and Fluent Bit.
OpenTelemetry is an open-source observability framework that provides a standardized way to collect and transmit telemetry data, such as traces, logs, and metrics, from applications and infrastructure. This makes it easier to monitor and troubleshoot your systems and gain insights into their behavior.
The OpenTelemetry project was formed by merging the OpenCensus and OpenTracing projects. It provides a common set of APIs, libraries, and tools for collecting and analyzing telemetry data in distributed systems.
OpenTelemetry has multiple components; the major ones are the language-specific APIs and SDKs, the OpenTelemetry Protocol (OTLP), and the OpenTelemetry Collector.
For more information about OpenTelemetry components, check out the docs.
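To make the "standardized way to transmit telemetry" concrete, here is a minimal sketch of the JSON shape that OTLP/HTTP uses for a single trace span. The field names (resourceSpans, scopeSpans, traceId, spanId) follow the OTLP JSON encoding; the span contents themselves are a made-up example, built with only the Python standard library:

```python
import json
import os
import time

# One trace span in OTLP/HTTP JSON form. The structure
# (resourceSpans -> scopeSpans -> spans) comes from the OTLP spec;
# the span values are illustrative.
now = time.time_ns()
span = {
    "traceId": os.urandom(16).hex(),   # 16 random bytes, hex-encoded
    "spanId": os.urandom(8).hex(),     # 8 random bytes, hex-encoded
    "name": "GET /generate",
    "kind": 2,                          # SPAN_KIND_SERVER
    "startTimeUnixNano": str(now),
    "endTimeUnixNano": str(now + 5_000_000),  # 5 ms later
}
payload = {"resourceSpans": [{"scopeSpans": [{"spans": [span]}]}]}

print(json.dumps(payload, indent=2))
```

Every OTLP-compatible tool in the pipeline, from Fluent Bit to the Collector to Jaeger, speaks this same wire format, which is what makes the components interchangeable.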
Components of our applications and infrastructure generate different types of logs, including application logs and system logs. Since multiple data sources generate tons of data, any observability pipeline solution must be capable of dealing with different data sources and formats, provide flexible routing, and be reliable and secure. That’s where Fluent Bit comes in.
Fluent Bit is an open-source, lightweight, and vendor-neutral telemetry pipeline agent for logs, metrics, and traces. It is generally used to collect, process, and route data to backends like Elasticsearch, Kafka, or other systems, and it can work with logs, metrics, traces, and any other form of input data. Because it is lightweight, it can run on edge devices and embedded devices as well as in cloud services.
Fluent Bit's pipeline is built from multiple components, including inputs, parsers, filters, buffering, routing, and outputs.
Read more about the components of Fluent Bit.
Fluent Bit can help enable observability pipelines by providing a simple and efficient way to collect and forward telemetry data. Fluent Bit can be configured to collect data from various sources, including logs, metrics, and traces, and can forward data to various backend systems, allowing developers and operators to gain a comprehensive insight into the behavior and performance of complex systems.
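Applications can make life easy for a tail-based pipeline by emitting one JSON object per line, which Fluent Bit's tail input and JSON parser can consume directly. The sketch below uses only the Python standard library; the field names and file path are illustrative stand-ins, not part of the demo repository:

```python
import json
import tempfile
import time

def log_line(level, message, **fields):
    """Render one structured log record as a single JSON line,
    the shape a tail input plus JSON parser expects."""
    record = {"ts": time.time(), "level": level, "message": message}
    record.update(fields)
    return json.dumps(record)

# Append a record to the file the tail input watches.
# The path here is a placeholder; point it at your [INPUT] Path.
log_path = tempfile.gettempdir() + "/demo-app.log"
with open(log_path, "a") as f:
    f.write(log_line("info", "order created", order_id=42) + "\n")
```

Structured lines like these let Fluent Bit's filters match and enrich individual fields instead of regex-parsing free-form text.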
Some of the key features that make Fluent Bit a popular choice for enabling an observability pipeline are its small footprint and low resource usage, its large catalog of input and output plugins, built-in parsing and filtering, and its vendor neutrality.
Let us see how we can build an observability pipeline using OpenTelemetry and Fluent Bit for monitoring and analyzing a microservices-based application. Here, the application is instrumented to send metrics, traces, and logs to Fluent Bit, which forwards them to the OTel Collector and finally to Jaeger and Prometheus.
For this demo, you will need Docker and Docker Compose installed. If you don't have them already, you can follow the official Docker Compose installation documentation, which has well-articulated steps. To follow the configuration steps, you can clone the already configured repository available here.
Fluent Bit uses a configuration file to specify its inputs, filters, and outputs. In this case, we will use the tail input plugin to collect logs from a file and the OpenTelemetry output plugin to forward the logs to the OpenTelemetry collector.
For Traces and Metrics
Similarly, we will use HTTP, with the opentelemetry input plugin listening for metrics and traces at /v1/metrics and /v1/traces, respectively.
Here’s an example configuration file for Fluent Bit:
[SERVICE]
    flush        1
    log_level    info

[INPUT]
    name         tail
    path         /var/log.log
    tag          demo-app

[FILTER]
    name         record_modifier
    match        demo-app
    record       hostname ${HOSTNAME}

[INPUT]
    name         opentelemetry
    host         0.0.0.0
    port         3000
    successful_response_code 200

[OUTPUT]
    name         stdout
    match        *

[OUTPUT]
    name                 opentelemetry
    match                *
    host                 collector
    port                 3030
    metrics_uri          /v1/metrics
    logs_uri             /v1/logs
    traces_uri           /v1/traces
    log_response_payload true
    tls                  off
    tls.verify           off
    # add user-defined labels
    add_label            app fluent-bit
    add_label            color blue
This configuration file specifies that Fluent Bit should read logs from the defined path, tag them as demo-app, and forward them to the OpenTelemetry Collector at http://collector:3030/v1/logs. The process is similar for traces and metrics: instead of the logs endpoint, they are forwarded to /v1/traces and /v1/metrics, as defined in the output plugin configuration.
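To sanity-check the pipeline, you can hand-deliver a record to the opentelemetry input defined above (listening on port 3000). This sketch uses only the Python standard library; the helper name and payload contents are illustrative, and the payload shape follows the OTLP/HTTP JSON convention:

```python
import json
import urllib.request

def build_log_request(body_text, host="localhost", port=3000):
    """Build an OTLP/HTTP POST request targeting Fluent Bit's
    opentelemetry input (path follows the OTLP convention)."""
    payload = {
        "resourceLogs": [{
            "scopeLogs": [{
                "logRecords": [{"body": {"stringValue": body_text}}]
            }]
        }]
    }
    return urllib.request.Request(
        f"http://{host}:{port}/v1/logs",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_log_request("hello from python")
print(req.full_url)  # http://localhost:3000/v1/logs
# To actually send it (with the demo stack running):
#   urllib.request.urlopen(req)
```

A successful delivery should return the 200 status set by successful_response_code, and the record should appear on the stdout output as well.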
OpenTelemetry Collector is a vendor-agnostic agent that can receive, process, and export telemetry data from a variety of sources. Here’s how to set up OpenTelemetry Collector:
Once you have configured Fluent Bit, you can use it to forward the telemetry data to an OpenTelemetry Collector. The OpenTelemetry Collector is responsible for collecting and processing telemetry data from different sources and forwarding it to a backend system such as Jaeger or Prometheus. Here is our otel-collector configuration YAML; you can read more about receivers, exporters, and processors here.
receivers:
  otlp:
    protocols:
      grpc:
      http:
        endpoint: "0.0.0.0:3030"
Once Fluent Bit is forwarding telemetry data to the OpenTelemetry Collector, we can export that data to visualization tools such as Jaeger and Prometheus to view and analyze it.
Jaeger is a distributed tracing system that can be used to visualize and analyze the performance of microservices-based distributed systems. To visualize traces in Jaeger, you will need to configure the OpenTelemetry Collector to forward trace data to Jaeger. You can then use the Jaeger UI to view and analyze the traces.
exporters:
  otlp:
    endpoint: "jaeger:4317"
    # disable tls
    tls:
      insecure: true
  logging:
  prometheus:
    endpoint: "0.0.0.0:8889"
Prometheus is a time-series database and monitoring system that can be used to visualize and analyze metrics data. To visualize metrics in Prometheus, we will need to configure the OpenTelemetry Collector to forward metrics data to Prometheus. Lastly, we need to configure Prometheus to scrape data from the exporter endpoint.
global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: demo
    scrape_interval: 5s
    static_configs:
      - targets: ['collector:8889']
The final OTel Collector YAML will look like this:
receivers:
  otlp:
    protocols:
      grpc:
      http:
        endpoint: "0.0.0.0:3030"

exporters:
  otlp:
    endpoint: "jaeger:4317"
    # disable tls
    tls:
      insecure: true
  logging:
  prometheus:
    endpoint: "0.0.0.0:8889"

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [logging]
    traces:
      receivers: [otlp]
      exporters: [logging, otlp]
    metrics:
      receivers: [otlp]
      exporters: [logging, prometheus]
To start local instances of the services, run the following command in the cloned repository:
$ docker-compose up --build
Example output:
(venv) ➜ fluent-bit-otel git:(master) ✗ docker compose up --build
[+] Building 2.1s (12/12) FINISHED
=> [internal] load build definition from dockerfile 0.0s
=> => transferring dockerfile: 279B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/python:3.8-slim-buster 2.0s
=> [auth] library/python:pull token for registry-1.docker.io 0.0s
=> [1/6] FROM docker.io/library/python:[email protected]:f2199258d29ec06b8bcd3ddcf93615cdc8210d18a942a56b1a488136074123f3 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 64B 0.0s
=> CACHED [2/6] WORKDIR /app 0.0s
=> CACHED [3/6] COPY requirements.txt . 0.0s
=> CACHED [4/6] RUN pip install -r requirements.txt 0.0s
=> CACHED [5/6] RUN apt-get update && apt-get install -y curl 0.0s
=> CACHED [6/6] COPY app.py . 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:613c0c77e65b1e512ba269a3745072e85e4f55bae7054d34e25d7f549a9b0bf7 0.0s
=> => naming to docker.io/library/fluent-bit-otel-app 0.0s
Use 'docker scan' to run Snyk tests against images to find vulnerabilities and learn how to fix them
[+] Running 6/5
⠿ Network fluent-bit-otel_default Created 0.1s
⠿ Container fluent-bit-otel-prometheus-1 Created 0.0s
⠿ Container fluent-bit-otel-jaeger-1 Created 0.1s
⠿ Container fluent-bit-otel-collector-1 Created 0.0s
⠿ Container fluent-bit-otel-fluentbit-1 Created 0.0s
⠿ Container fluent-bit-otel-app-1 Created 0.0s
To generate traces, run the following command in the terminal:
curl -X GET http://localhost:5000/generate
In the above screenshot, we can see the logs are generated when we hit the generate traces endpoint, and Fluent Bit collects them and forwards them to the OTel Collector.
OpenTelemetry and Fluent Bit integration can be used in a wide range of use cases, including centralized logging, application performance monitoring, and distributed tracing. Let’s dive deeper into each of these use cases.
Centralized logging involves collecting, aggregating, and analyzing logs from multiple sources in a central location. This allows you to monitor the behavior and performance of your entire system in one place, making it easier to detect and diagnose issues.
By integrating Fluent Bit with OpenTelemetry, you can collect and process logs from multiple sources, and forward them to a centralized logging system such as Elasticsearch or Splunk. This enables you to quickly search and analyze logs from multiple sources, and identify patterns and trends that can help you optimize performance and troubleshoot issues.
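As a sketch of the centralized-logging case, routing the same tagged logs to Elasticsearch is a matter of adding another output section alongside the existing ones; the host, port, and index values below are placeholders to adapt to your environment:

```
[OUTPUT]
    name   es
    match  demo-app
    host   elasticsearch
    port   9200
    index  demo-logs
```

Because Fluent Bit routes on tags, the same demo-app records can flow to Elasticsearch, stdout, and the OTel Collector simultaneously without touching the inputs.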
Application performance monitoring (APM) involves collecting and analyzing telemetry data such as logs, metrics, and traces to monitor the performance of your applications. APM tools can provide insights into the behavior and performance of your applications, allowing you to identify performance bottlenecks, optimize resource utilization, and troubleshoot issues.
By integrating Fluent Bit with OpenTelemetry, you can collect and process telemetry data from multiple sources, and forward it to an APM tool such as Datadog or New Relic. This enables you to monitor the performance of your applications in real-time and identify and diagnose issues quickly.
Distributed tracing involves tracking the flow of requests through a distributed system, and collecting telemetry data such as traces and spans to monitor the performance and behavior of the system. Distributed tracing can help you identify performance bottlenecks, optimize resource utilization, and troubleshoot issues in distributed systems.
By integrating Fluent Bit with OpenTelemetry, you can collect and process telemetry data such as traces and spans and forward it to a distributed tracing system such as Jaeger or Zipkin. This enables you to track the flow of requests through your distributed system and analyze the performance and behavior of your system in real-time.
In conclusion, an OpenTelemetry observability pipeline built with Fluent Bit offers a highly efficient and flexible solution for observability needs. One of its key advantages is the breadth of plugins covering virtually every data source and destination, which lets you build a single pipeline without installing new agents and exporters and thereby simplifies the observability architecture.
Moreover, the pipeline is highly customizable, enabling users to add new sources and destinations as needed without a major overhaul of the entire system. This can be especially valuable for organizations that are constantly evolving and need to adapt their observability tools accordingly. Additionally, the all-in-one support for traces, logs, and metrics provided by Fluent Bit simplifies the management of data streams, reducing the complexity of the observability pipeline. Overall, an OpenTelemetry observability pipeline built with Fluent Bit is a versatile and efficient solution that can help organizations gain valuable insights into their applications without being locked into a single vendor or facing difficulties with scaling and complexity.
To learn more about Fluent Bit, check out Fluent Bit Academy, your destination for best practices and how-to’s on advanced processing, routing, and all things Fluent Bit. Here’s a sample of what you can find there:
Observability or telemetry pipelines enable organizations to receive data from multiple sources, enrich, transform, redact, and reduce it before routing it to its intended destinations for storage and analysis. The result is that organizations are able to control their data from collection to backend destination, reducing the complexity of managing multiple pipelines, reducing the volume of data sent to backend systems, and decreasing backend costs.
Discover why Gartner® predicts, “By 2026, 40% of log telemetry will be processed through a telemetry pipeline product, an increase from less than 10% in 2022.”
With Chronosphere’s acquisition of Calyptia in 2024, Chronosphere became the primary corporate sponsor of Fluent Bit. Eduardo Silva — the original creator of Fluent Bit and co-founder of Calyptia — leads a team of Chronosphere engineers dedicated full-time to the project, ensuring its continuous development and improvement.
Fluent Bit is a graduated project of the Cloud Native Computing Foundation (CNCF) under the umbrella of Fluentd, alongside other foundational technologies such as Kubernetes and Prometheus. Chronosphere is also a silver-level sponsor of the CNCF.
Request a demo for an in-depth walkthrough of the platform!