A guide to distributed tracing with AWS X-Ray and Fluent Bit


While most commonly used for logging, Fluent Bit is also capable of handling traces. Learn how Fluent Bit collects and sends OTel-compliant tracing data to AWS X-Ray.

Sharad Regoti | Guest Author

Sharad Regoti is a CKA & CKS certified software engineer based in Mumbai.


Distributed tracing and observability

In software development, observability allows us to understand a system from the outside by asking questions about the system without knowing its inner workings. Furthermore, it allows us to troubleshoot quickly and helps answer the question, “Why is this happening?”

For us to ask (and answer) those questions, the application must be instrumented. That is, the application code must emit signals such as traces, metrics, and logs, which will contain the answers we seek. In this post, we will focus specifically on traces.

Distributed tracing involves tracking the flow of requests through a distributed system and collecting telemetry data such as traces and spans to monitor the system’s performance and behavior. Distributed tracing helps identify performance bottlenecks, optimize resource utilization, and troubleshoot issues in distributed systems.
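
To make the trace and span vocabulary concrete, here is a minimal sketch in plain Python. It is not the OpenTelemetry API: the Span class, its field names, and the example operation names are assumptions made purely for illustration. It shows the two ideas that matter for the rest of this post: every span in a request shares one trace ID, and each span points at its parent.

```python
import secrets
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Span:
    """One timed operation within a trace (hypothetical, simplified model)."""
    name: str
    trace_id: str                      # shared by every span in the same trace
    span_id: str = field(default_factory=lambda: secrets.token_hex(8))
    parent_id: Optional[str] = None    # None marks the root span
    start_ns: int = 0
    end_ns: int = 0

    def duration_ms(self) -> float:
        return (self.end_ns - self.start_ns) / 1_000_000


def child_of(parent: Span, name: str) -> Span:
    """Create a span that belongs to the parent's trace."""
    return Span(name=name, trace_id=parent.trace_id, parent_id=parent.span_id)


def depth(span: Span, by_id: Dict[str, Span]) -> int:
    """Count a span's ancestors by following parent links up to the root."""
    levels = 0
    while span.parent_id is not None:
        span = by_id[span.parent_id]
        levels += 1
    return levels


# A three-level trace: request -> database query -> cache lookup.
root = Span(name="GET /checkout", trace_id=secrets.token_hex(16))
db = child_of(root, "SELECT orders")
cache = child_of(db, "cache lookup")
spans = [root, db, cache]
by_id = {s.span_id: s for s in spans}

# Print the hierarchy, indenting each span by its depth in the trace.
for s in spans:
    print("  " * depth(s, by_id) + s.name)
```

A tracing backend reconstructs the same tree from the parent links, which is how it can show where time was spent inside a single request.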

Many platforms are used to monitor and analyze trace data and help engineers spot problems, including Chronosphere, Datadog, and the open-source Jaeger. Today, though, we will be using AWS X-Ray, a less commonly used platform but a convenient one for demonstration purposes since so many developers have AWS accounts.

To collect and route the traces to X-Ray we’ll be using Fluent Bit, a widely used open-source data collection agent, processor, and forwarder. Fluent Bit is most commonly used for logging, but it is also capable of handling traces and metrics, making it an ideal single-agent choice for any type of telemetry data.

In this post, we’ll guide you through the process of sending distributed traces to AWS X-Ray using Fluent Bit.

Prerequisites

  • Docker and Docker Compose: Installed on your local machine.
  • An AWS account
  • AWS CLI: A tool for managing AWS services. Install it by following the official AWS CLI installation guide. After installation, configure your credentials and default region by running aws configure. For detailed instructions, refer to the AWS CLI configuration guide.
  • Familiarity with Fluent Bit concepts such as inputs, outputs, parsers, and filters. If you’re not familiar with these concepts, please refer to the official documentation.

Distributed tracing workflow

Diagram showing data flow: Instrumented Application sends trace data to Centralized Observability Data Shipper, such as Fluent Bit, which then sends data to Distributed Trace Storage Engine like AWS X-Ray. Arrows denote data movement direction for effective distributed tracing.

Instrumented applications emit trace data that is collected and processed by a centralized agent, which then sends the data to a backend for storage and analysis.

Generating trace data

In a microservices architecture, applications are instrumented using specific libraries to send trace data in a particular format supported by the storage engine.

OpenTelemetry (OTel) has become the standard for working with telemetry data. It is an open-source observability framework that provides a common set of APIs, libraries, and tools for collecting and transmitting telemetry data such as traces, logs, and metrics from applications in distributed systems.

We will be using a Python application, built with the Flask framework, that we’ve instrumented with the OpenTelemetry SDK to emit trace data over the OpenTelemetry Protocol (OTLP).

We will configure Fluent Bit to receive the emitted trace data using the OpenTelemetry input plugin.

Note: For simplicity and demonstration purposes, we will be using a single service capable of generating a hierarchical distributed trace. But in a practical scenario, there would be multiple services instrumented to generate trace data.

Storing trace data in AWS X-Ray

AWS X-Ray accepts trace requests in the form of segment documents, which can be sent using two primary protocols:

  1. AWS X-Ray API (HTTP): You can send segment documents directly to the AWS X-Ray API using the PutTraceSegments API. This is done using HTTP/1.1.
  2. Direct UDP: You can send segment documents over UDP to the AWS X-Ray daemon, which runs alongside the application. The X-Ray daemon buffers segments in a queue and uploads them to X-Ray in batches.
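
To give a feel for what a segment document looks like, here is a minimal sketch that builds one and wraps it the way the X-Ray daemon expects over UDP: a small JSON header line, a newline, then the segment JSON. The helper names (new_xray_trace_id, build_segment, to_daemon_datagram) and the sample timestamps are our own illustrative assumptions, and only a minimal set of segment fields is included. The sketch only builds the datagram; the send step is left commented out and assumes a daemon on its default address of 127.0.0.1:2000.

```python
import json
import secrets
import time

# Header line the X-Ray daemon expects before each segment document.
DAEMON_HEADER = '{"format": "json", "version": 1}'


def new_xray_trace_id(now=None):
    """Build an X-Ray style trace ID: version, epoch seconds in hex, 96 random bits."""
    epoch = int(now if now is not None else time.time())
    return "1-{:08x}-{}".format(epoch, secrets.token_hex(12))


def build_segment(name, start, end, trace_id=None):
    """Return a minimal segment document as a dict (illustrative field set)."""
    return {
        "name": name,                       # logical service name
        "id": secrets.token_hex(8),         # 16 hex digit segment ID
        "trace_id": trace_id or new_xray_trace_id(start),
        "start_time": start,                # epoch seconds, float
        "end_time": end,
    }


def to_daemon_datagram(segment):
    """Prefix the segment JSON with the daemon header, separated by a newline."""
    return (DAEMON_HEADER + "\n" + json.dumps(segment)).encode("utf-8")


segment = build_segment("demo-app", start=1700000000.0, end=1700000000.25)
datagram = to_daemon_datagram(segment)
print(datagram.decode("utf-8"))

# To actually send it (assumes a local X-Ray daemon on the default UDP port):
# import socket
# sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# sock.sendto(datagram, ("127.0.0.1", 2000))
```

In this post we do not talk to the daemon or the API directly, but the format is worth knowing because it is what the data must eventually be converted into.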

Unfortunately, AWS X-Ray uses its own trace ID format rather than the standard one. Fluent Bit does not support the custom X-Ray API format, so it cannot send trace data directly to AWS X-Ray. To overcome this, we will be using the AWS Distro for OpenTelemetry (ADOT), which accepts OTLP input and can be used with the Fluent Bit OpenTelemetry output plugin. ADOT automatically converts the standard trace ID to the format required by AWS X-Ray.
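
The difference between the two ID formats is easy to see in code. An OpenTelemetry trace ID is 32 hexadecimal digits, while an X-Ray trace ID has the shape 1-<8 hex digits of epoch seconds>-<24 hex digits>. The sketch below shows the mechanical part of the conversion as we understand it: the first 8 hex digits are treated as the timestamp and the remaining 24 as the unique part. The function names are hypothetical, and the real exporter may apply additional validation that is not modeled here.

```python
import re

OTEL_TRACE_ID = re.compile(r"[0-9a-f]{32}")
XRAY_TRACE_ID = re.compile(r"1-([0-9a-f]{8})-([0-9a-f]{24})")


def otel_to_xray_trace_id(trace_id_hex):
    """Reshape a 32 hex digit trace ID into the X-Ray 1-<epoch>-<unique> form."""
    tid = trace_id_hex.lower()
    if not OTEL_TRACE_ID.fullmatch(tid):
        raise ValueError("expected 32 hexadecimal digits, got %r" % trace_id_hex)
    return "1-{}-{}".format(tid[:8], tid[8:])


def xray_to_otel_trace_id(xray_id):
    """Reverse the reshaping: drop the version prefix and the dashes."""
    match = XRAY_TRACE_ID.fullmatch(xray_id.lower())
    if match is None:
        raise ValueError("not an X-Ray trace ID: %r" % xray_id)
    return match.group(1) + match.group(2)


def embedded_epoch(xray_id):
    """Read the epoch seconds that X-Ray expects in the middle field."""
    return int(xray_id.split("-")[1], 16)


otel_id = "5759e988bd862e3fe1be46a994272793"
xray_id = otel_to_xray_trace_id(otel_id)
print(xray_id)
print(embedded_epoch(xray_id))
```

Because the middle field is interpreted as a timestamp, an ID generator that is aware of X-Ray places the current time in the first four bytes instead of using purely random bytes.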

Our architecture looks like this:

Diagram explaining how to send trace telemetry data to AWS X-Ray using Fluent Bit, illustrating input and output processes via the OpenTelemetry plugin for efficient distributed tracing, leading to AWS Distro for OpenTelemetry and AWS X-Ray.

Fluent Bit both receives and submits OTLP, but the data must be converted to the bespoke format required by AWS X-Ray.

Configuring Fluent Bit

Here’s the Fluent Bit configuration that enables the architecture depicted above:

[SERVICE]
    flush 1
    log_level info

[INPUT]
    name opentelemetry
    host 0.0.0.0
    port 3000
    successful_response_code 200
    
[OUTPUT]
    Name                opentelemetry
    Match               *
    Host                aws-adot
    Port                4318
    traces_uri          /v1/traces
    tls                 off
    tls.verify          off
    add_label           app fluent-bit

Breaking down the configuration above, we define one input section and one output section:

INPUT section

  • name opentelemetry: Specifies the input plugin to use, which in this case is opentelemetry. This plugin is designed to receive telemetry data (metrics, logs, and traces) following the OpenTelemetry format.
  • host 0.0.0.0: This binds the input listener to all available IP addresses on the machine, making it accessible from other machines.
  • port 3000: Defines the port on which Fluent Bit will listen for incoming data.
  • successful_response_code 200: This is the HTTP response code that Fluent Bit will send back to the sender to indicate that the data was received successfully. A value of 200 corresponds to HTTP OK, meaning the request has succeeded.

OUTPUT section

  • Name opentelemetry: Specifies the output plugin to use. This indicates Fluent Bit will forward the processed data to another service or tool supporting OpenTelemetry data.
  • Match *: This pattern matches all incoming data. In Fluent Bit, the Match directive is used to filter which data is sent to a particular output based on the tag associated with the data. The asterisk is a wildcard that matches all tags.
  • Host aws-adot: Specifies the destination host to which the data will be forwarded.
  • Port 4318: Defines the port on the destination host where the OpenTelemetry collector or service is listening.
  • traces_uri /v1/traces: Sets the specific URI endpoint where trace data should be sent. This is part of the OpenTelemetry specification for sending trace data.
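  • tls off: Indicates that TLS (Transport Layer Security) will not be used for this connection, meaning data will be sent in plaintext.
  • tls.verify off: Disables verification of the server certificate. It has no effect here because TLS itself is turned off.
  • add_label app fluent-bit: This adds a label to the data being sent out. Labels are key-value pairs. Here, app is the key, and fluent-bit is the value. The Fluent Bit documentation describes this option as adding a custom label to metrics, so it is not expected to change the traces in this demo.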
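
As a quick sanity check on the explanation above, the following sketch parses a trimmed copy of the configuration and prints where Fluent Bit listens and where it forwards. The parser is a deliberately simplified assumption of ours: it only understands [SECTION] headers and whitespace-separated key/value lines, and it is not how Fluent Bit itself reads configuration. The configuration above mixes lowercase and capitalized keys, so the sketch lowercases them.

```python
FLUENT_BIT_CONF = """\
[INPUT]
    name opentelemetry
    host 0.0.0.0
    port 3000

[OUTPUT]
    Name        opentelemetry
    Match       *
    Host        aws-adot
    Port        4318
    traces_uri  /v1/traces
"""


def parse_classic_config(text):
    """Return a list of (section_name, {key: value}) pairs."""
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            sections.append((line[1:-1].strip().upper(), {}))
        elif sections:
            parts = line.split(None, 1)  # key, then the rest of the line
            value = parts[1] if len(parts) > 1 else ""
            sections[-1][1][parts[0].lower()] = value
    return sections


def describe(sections):
    """Summarize each input and output as 'KIND plugin host:port[uri]'."""
    summary = []
    for name, opts in sections:
        if name in ("INPUT", "OUTPUT"):
            summary.append("{} {} {}:{}{}".format(
                name, opts["name"], opts["host"], opts["port"],
                opts.get("traces_uri", "")))
    return summary


for line in describe(parse_classic_config(FLUENT_BIT_CONF)):
    print(line)
```

The output side must line up with the collector: the host name has to resolve to the ADOT container and the port has to match the collector's OTLP HTTP listener.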

With our INPUT and OUTPUT configuration explained, let’s implement it in practice.

Create Fluent Bit configuration file

Create a file called fluent-bit.conf with the following contents:

[SERVICE]
    flush 1
    log_level info

[INPUT]
    name opentelemetry
    host 0.0.0.0
    port 3000
    successful_response_code 200
    
[OUTPUT]
    Name                opentelemetry
    Match               *
    Host                aws-adot
    Port                4318
    traces_uri          /v1/traces
    tls                 off
    tls.verify          off
    add_label           app fluent-bit

Create OTel configuration

Create a file called otel.yaml with the following contents. Be sure to replace the placeholder <put-your-aws-region> with your AWS region.

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
exporters:
  awsxray:
    region: <put-your-aws-region>
service:
  pipelines:
    traces:
      receivers:
        - otlp
      exporters:
        - awsxray

This configuration defines how the AWS Distro for OpenTelemetry (ADOT) Collector operates. It specifies the collection (receivers) of telemetry data via OpenTelemetry Protocol (OTLP) over gRPC and HTTP, and the export (exporters) of trace data to AWS X-Ray:

  • Receivers: Configures the ADOT Collector to receive telemetry data.
    • OTLP Receiver: Accepts data over two protocols:
      • gRPC: Listens on 0.0.0.0:4317 for incoming gRPC connections.
      • HTTP: Listens on 0.0.0.0:4318 for incoming HTTP connections.
  • Exporters: Defines how and where processed data is sent.
    • AWS X-Ray Exporter: Configured to send trace data to the AWS X-Ray service in the specified AWS region.
  • Service:
    • Pipelines: Organizes the flow of data from receivers to exporters.
      • Traces Pipeline: Specific for trace data, it uses the otlp receiver to collect data and the awsxray exporter to send the data to AWS X-Ray.

This configuration sets up the ADOT Collector to collect telemetry data using OTLP over both gRPC and HTTP and to export trace data to AWS X-Ray for analysis and visualization.
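
If you would rather not edit the placeholder by hand, a small script can render the file for you. This is an optional convenience sketch, not part of the original setup: the render_otel_config helper is hypothetical, and the choice of the AWS_REGION environment variable with a us-east-1 fallback is an assumption made for illustration. It prints the rendered YAML so you can redirect it into otel.yaml.

```python
import os
from string import Template

# Same collector configuration as above, with the region left as a variable.
OTEL_TEMPLATE = Template("""\
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
exporters:
  awsxray:
    region: $region
service:
  pipelines:
    traces:
      receivers:
        - otlp
      exporters:
        - awsxray
""")


def render_otel_config(region):
    """Fill in the region, refusing empty values and unreplaced placeholders."""
    if not region or "<" in region or ">" in region:
        raise ValueError("set a real AWS region, for example us-east-1")
    return OTEL_TEMPLATE.substitute(region=region)


# Assumption: take the region from AWS_REGION, falling back to us-east-1.
print(render_otel_config(os.environ.get("AWS_REGION") or "us-east-1"))
```

For example, running the script as python render_otel.py > otel.yaml (the script name is arbitrary) produces a file that is ready to mount into the collector container.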

Create Docker Compose configuration

Create a file called docker-compose.yml with the following contents and replace these two values, <put-your-aws-access-keys-id> and <put-your-aws-secret-access-key>, with your AWS credentials. Also set AWS_REGION to the same region you used in otel.yaml.

version: '3.8'
services:
  aws-adot:
    image: public.ecr.aws/aws-observability/aws-otel-collector:latest
    container_name: aws-adot
    ports:
      - "4317:4317" # OTLP gRPC port
      - "4318:4318" # OTLP HTTP port
      - "55679:55679" # zPages port (only served if the zpages extension is enabled)
    volumes:
      - "./otel.yaml:/otel.yaml"
    environment:
      - AWS_REGION=ap-south-1
      - AWS_ACCESS_KEY_ID=<put-your-aws-access-keys-id>
      - AWS_SECRET_ACCESS_KEY=<put-your-aws-secret-access-key>
    command: ["--config", "/otel.yaml"]
    restart: "no"

  fluent-bit:
    image: cr.fluentbit.io/fluent/fluent-bit:2.2
    container_name: fluent-bit
    ports:
      - "3000:3000"
    volumes:
      - "./fluent-bit.conf:/fluent-bit/etc/fluent-bit.conf"
    restart: "no"

  trace-generator:
    image: sharadregoti/trace-generator:v0.1.0
    container_name: trace-generator
    ports:
      - "5000:5000"
    environment:
      - OTEL_HOST_ADDR=fluent-bit:3000
    restart: "no"

This docker-compose.yml file defines a multi-container setup with three services: aws-adot, fluent-bit, and trace-generator.

  • aws-adot:
    • Uses the public.ecr.aws/aws-observability/aws-otel-collector:latest image.
    • Exposes ports 4317 (OTLP over gRPC), 4318 (OTLP over HTTP), and 55679.
    • Mounts a local otel.yaml configuration file into the container.
    • Configures AWS credentials and region through environment variables.
    • Specifies a command to use the mounted config file.
  • fluent-bit:
    • Based on the cr.fluentbit.io/fluent/fluent-bit:2.2 image.
    • Exposes port 3000, where the OpenTelemetry input plugin receives trace data.
    • Mounts a local fluent-bit.conf configuration file into the container.
  • trace-generator:
    • Uses the sharadregoti/trace-generator:v0.1.0 image.
    • Exposes port 5000.
    • Uses an environment variable to specify the fluent-bit service as the destination for trace data.

Start Docker containers

docker-compose up
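
The containers take a few seconds to start listening. If you want to confirm that the stack is up before generating traces, a sketch like the following checks the ports published by the Compose file above. The helper names and the STACK mapping are our own; the port numbers come from docker-compose.yml, and the check assumes the ports are published on localhost.

```python
import socket
import time


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_ports(targets, deadline_s=60.0, interval_s=1.0):
    """Poll until every target accepts connections; return names still down."""
    pending = dict(targets)
    stop = time.monotonic() + deadline_s
    while pending and time.monotonic() < stop:
        for name, (host, port) in list(pending.items()):
            if port_is_open(host, port):
                del pending[name]
        if pending:
            time.sleep(interval_s)
    return sorted(pending)


# Ports published by the three services in docker-compose.yml.
STACK = {
    "fluent-bit (OTLP input)": ("localhost", 3000),
    "aws-adot (OTLP HTTP)": ("localhost", 4318),
    "trace-generator": ("localhost", 5000),
}

# One-shot status report; call wait_for_ports(STACK) instead to block until ready.
for name, (host, port) in STACK.items():
    print("{:<26} {}".format(name, "up" if port_is_open(host, port) else "down"))
```

An open port only shows that a process is listening. It does not prove that the AWS credentials are valid, so also check the aws-adot container logs if traces do not show up.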

Generate traces by hitting the sample app

Open a new terminal and run either of the following curl requests to generate a trace:

curl -X GET http://localhost:5000/generate-hierarchical
or
curl -X GET http://localhost:5000/generate

Go to the AWS console and open AWS X-Ray

You should see a newly generated trace, as shown in the image below.

Screenshot of a CloudWatch Trace view showing a newly generated trace with response time distribution data and traces list. The trace query, integrated with AWS X-Ray for distributed tracing, highlights one trace with a 200 response code and minimal duration.

Click on the newly created trace to view the detailed information about the request.

Screenshot of CloudWatch Trace details showing visualized segments timeline for demo-app, parent-segment, and grand-parent-segment with start times and durations. Utilizing AWS X-Ray for distributed tracing, logs for this trace are available.

Clean up

Execute the following to shut everything down:

# Press Ctrl+C in the terminal where the containers are running in the foreground, then run:
docker-compose down

Conclusion

In this post, we’ve walked through setting up distributed tracing with AWS X-Ray and Fluent Bit: an instrumented application emits OTLP traces, Fluent Bit collects and forwards them, and the ADOT Collector converts and delivers them to X-Ray. Because Fluent Bit also handles logs and metrics, the same agent can serve as the single collection point for all of your telemetry data.

Learn more

To learn more about Fluent Bit, visit the project website or Fluent Bit Academy, where you will find hours of on-demand training videos covering best practices and how-tos on advanced processing, routing, and all things Fluent Bit. Here’s a sample of what you can find there:

  • Getting Started with Fluent Bit and OpenSearch
  • Getting Started with Fluent Bit and OpenTelemetry
  • Parsing 101 with Fluent Bit
  • Advanced Routing with Fluent Bit v3


We also invite you to join the vibrant Fluent community. Visit the project’s GitHub repository to learn how to become a contributor. Or join the Fluent Slack where you will find thousands of fellow Fluent Bit and Fluentd users helping one another with issues and discussing the projects’ roadmaps.

About Fluent Bit and Chronosphere

With Chronosphere’s acquisition of Calyptia in 2024, Chronosphere became the primary corporate sponsor of Fluent Bit. Eduardo Silva — the original creator of Fluent Bit and co-founder of Calyptia — leads a team of Chronosphere engineers dedicated full-time to the project, ensuring its continuous development and improvement.

Fluent Bit is a graduated project of the Cloud Native Computing Foundation (CNCF) under the umbrella of Fluentd, alongside other foundational technologies such as Kubernetes and Prometheus. Chronosphere is also a silver-level sponsor of the CNCF.
