Control data from collection to backend destination
Telemetry pipelines allow organizations to perform in-stream data processing, filtering and routing of data to multiple destinations at scale. As a result, organizations can control their data from collection to backend destination, alleviating the complexity of managing multiple pipelines, reducing the volume of data being sent to observability and SIEM platforms, and decreasing backend costs.
Adoption and deployment of telemetry pipelines are on the rise. A 2023 Gartner® report noted more than ten vendors in the telemetry pipeline space, compared to zero in 2018. The Gartner report also noted that the number of client inquiries about telemetry pipelines increased 500% between 2021 and 2023. Furthermore, Gartner forecasts that 40% of all logging will utilize a telemetry pipeline by 2026.
While the telemetry pipeline is new to many, contributors and maintainers of the Fluentd and Fluent Bit projects have been working on this problem for more than a decade. The Fluent projects are part of the Cloud Native Computing Foundation (CNCF) and have achieved graduated status.
Eduardo Silva, Staff Engineer at Chronosphere and the original creator of Fluent Bit, shares: “Companies were looking for a way to build and deliver holistic observability initiatives. At the same time, the move to Kubernetes and microservices increased the complexity and cost of implementing these initiatives. We knew that organizations needed the ability to manage data from multiple sources to their observability and SIEM backends; they needed a telemetry pipeline. That is why we built Calyptia.” (Many also know Eduardo Silva as the founder of Calyptia. Calyptia was acquired by Chronosphere earlier this year, enabling us to double down on our efforts to support the Fluent communities.)
Over the past decade our team, including several Fluent Bit maintainers, has spoken with hundreds of SREs, observability engineers, DevOps managers, and others who are working to build holistic observability programs, reduce logging infrastructure complexity, and manage the costs of their observability and SIEM platforms. In this post we share a list of the top requested requirements for a telemetry pipeline.
Top requested requirements for a telemetry pipeline
#1 Support for open standards and libraries, including OpenTelemetry and Prometheus
Over the past decade, open standards have significantly tamed the Wild West of telemetry data. Open standards ensure compatibility across different systems, tools, and platforms. This interoperability allows for seamless integration of various monitoring and observability solutions within your infrastructure. Open standards also enable you to break free from vendor-specific collection agents.
#2 Route data to multiple destinations
The ability to route data to multiple destinations is core functionality for a telemetry pipeline. Organizations have multiple uses and requirements for their data. Some data may require long-term storage for compliance, while some data may need to be available for analysis by a SIEM platform. Teams may not have settled on a unified platform and may need to send logs to one application and metrics and traces to another. Telemetry pipelines make multi-routing simple, allowing engineering teams to devote their attention elsewhere.
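As a rough illustration of multi-destination routing, the sketch below copies each event to every destination configured for its signal type. The routing table, destination names, and event shape are all hypothetical; a real pipeline would express this in its own configuration format.

```python
from collections import defaultdict

# Hypothetical routing table: signal type -> destination names (all invented).
ROUTES = {
    "log": ["archive", "siem"],
    "metric": ["metrics_backend"],
    "trace": ["tracing_backend"],
}


def fan_out(events, routes=ROUTES):
    """Copy each event to every destination configured for its signal type."""
    outboxes = defaultdict(list)
    for event in events:
        # Events with an unknown type are not routed anywhere in this sketch.
        for destination in routes.get(event.get("type"), []):
            outboxes[destination].append(event)
    return dict(outboxes)


events = [
    {"type": "log", "message": "user login"},
    {"type": "metric", "name": "cpu", "value": 0.42},
]
outboxes = fan_out(events)
```

Here the single log event lands in both the archive and the SIEM outbox, while the metric goes only to the metrics backend.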
#3 Add tags/attributes to telemetry data
Tags are key-value pairs of data that provide contextual information. In OpenTelemetry, they are called attributes, but the concept is the same. Tags are generally applied to the data upon collection and are used to distinguish and group telemetry data. They help the pipeline identify what actions it should take with the data — what processing rules should run, which filters should be applied, where the data should be routed, etc.
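A minimal sketch of tagging at collection time might look like the following. The `tag_event` helper and the tag names are assumptions made for illustration, not part of any particular agent's API.

```python
def tag_event(event, **tags):
    """Return a copy of the event with the given key-value tags merged in."""
    merged = {**event.get("tags", {}), **tags}
    return {**event, "tags": merged}


raw = {"message": "payment accepted"}
tagged = tag_event(raw, env="prod", team="billing")
```

Later pipeline stages can then inspect `tagged["tags"]` to decide which rules, filters, and routes apply.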
#4 Route data based on content, tags, or lookup
Your telemetry pipeline should be able to route data based on metadata added as tags/attributes, the content of the data itself, or based upon lookups performed with an external data source. Doing so allows you to apply complex business logic when determining where to route your data.
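The three routing inputs named above (tags, content, and lookups) can be sketched as one decision function. Everything here is hypothetical: the destination names, the `compliance` tag, and the in-memory dictionary standing in for an external data source.

```python
# Hypothetical lookup table standing in for an external data source.
OWNER_BY_SERVICE = {"checkout": "payments-team"}


def choose_destination(event, lookup=OWNER_BY_SERVICE):
    """Pick one destination using tags, then content, then a lookup."""
    tags = event.get("tags", {})

    # Tag-based rule: anything flagged for compliance goes to long-term storage.
    if tags.get("compliance") == "true":
        return "archive"

    # Content-based rule: failed logins are security relevant.
    if "login failed" in event.get("message", "").lower():
        return "siem"

    # Lookup-based rule: resolve the owning team from the service tag.
    if lookup.get(tags.get("service")) == "payments-team":
        return "payments_backend"

    return "default_backend"
```

The rule order encodes priority, so a compliance-tagged event is archived even if its message also mentions a failed login.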
#5 Enrich, redact, or mask data in stream
Telemetry pipelines sit between the data sources and the data destinations. Consequently, they can access information about the sources that would typically be unavailable further downstream. They can then enrich the telemetry data with this additional context before routing it to its final destination.
This ability to enrich event data with metadata is particularly important for Kubernetes deployments, where ephemeral information such as pod names, IDs, and labels is not included in typical application logs. This metadata is useful in reducing mean time to remediate (MTTR).
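As a simple sketch of in-stream enrichment, the function below attaches pod metadata to an application log record. The `kubernetes` key and the metadata fields are illustrative assumptions; in practice the pipeline would obtain these values from the cluster rather than from a hard-coded dictionary.

```python
def enrich(event, pod_metadata):
    """Attach pod metadata under a 'kubernetes' key without mutating the input."""
    return {**event, "kubernetes": dict(pod_metadata)}


log_line = {"message": "request timed out"}
pod = {"pod_name": "web-1", "namespace": "shop", "labels": {"app": "web"}}
enriched = enrich(log_line, pod)
```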
Telemetry pipelines can also identify and remove or obfuscate sensitive data that should not be available to downstream systems or storage. This could include personally identifiable information (PII), credit card numbers, etc. Many industries require such redactions be performed at source, and moving unredacted data is a compliance/accreditation failure. But even for industries without such strict standards, the ability to redact data significantly reduces risk should a data breach occur.
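To make the redaction idea concrete, here is a minimal sketch that masks card-like digit runs and email-like strings before an event leaves the pipeline. The two patterns are deliberately loose assumptions for illustration and are not sufficient for real compliance work.

```python
import re

# Loose, illustrative patterns only; real redaction rules need more care.
CARD_LIKE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")   # 13 to 16 digits, optional separators
EMAIL_LIKE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(text, placeholder="[REDACTED]"):
    """Replace card-like numbers and email-like strings with a placeholder."""
    text = CARD_LIKE.sub(placeholder, text)
    return EMAIL_LIKE.sub(placeholder, text)
```

Short digit runs such as order numbers fall below the 13-digit threshold and pass through unchanged.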
#6 Filter and/or drop events based on content
One of the most powerful features of a telemetry pipeline is its ability to process the data in flight and apply logic based on its content. Stories abound about organizations being hit with unexpectedly high bills from their observability or SIEM platforms because a developer inadvertently left debug mode on, causing ingress charges to soar. A simple rule applied in the telemetry pipeline can drop all debug messages rather than routing them to a backend.
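The debug-mode scenario above reduces to a one-line rule. This sketch assumes each event carries a `level` field, which is an illustrative convention rather than a required schema.

```python
def drop_debug(events):
    """Keep every event whose level is not 'debug' (case-insensitive)."""
    return [e for e in events if str(e.get("level", "")).lower() != "debug"]


events = [
    {"level": "DEBUG", "message": "cache miss"},
    {"level": "error", "message": "db unreachable"},
    {"message": "no level set"},
]
kept = drop_debug(events)
```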
Similarly, your pipeline should be able to apply filters to the data, enabling you, for example, to strip out unnecessary key-value pairs or otherwise modify the data to fit your needs. It should also be able to identify and drop duplicate records and reduce high-cardinality metrics to control costs.
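A rough sketch of two of these filters, stripping unwanted keys and dropping exact duplicates, is shown below. The helper names are hypothetical, and the duplicate check assumes events are JSON-serializable dictionaries.

```python
import json


def strip_keys(event, unwanted):
    """Return a copy of the event without the unwanted keys."""
    return {k: v for k, v in event.items() if k not in unwanted}


def dedupe(events):
    """Drop exact duplicates, keeping the first occurrence in order."""
    seen = set()
    unique = []
    for event in events:
        # A canonical JSON string serves as a simple fingerprint.
        fingerprint = json.dumps(event, sort_keys=True)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(event)
    return unique
```

Stripping noisy keys first can make more records identical, so running `strip_keys` before `dedupe` tends to remove more duplicates.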
Ideally, your pipeline should be able to report how any changes impact the volume of data being routed to your backends.
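One simple way to picture such a report is to compare the serialized size of events before and after processing. This is a sketch under the assumption that JSON-encoded size is an acceptable proxy for ingest volume; the report fields are invented for illustration.

```python
import json


def size_in_bytes(events):
    """Total UTF-8 size of the events when serialized as JSON."""
    return sum(len(json.dumps(e).encode("utf-8")) for e in events)


def volume_report(before, after):
    """Summarize how much a processing step reduced event count and bytes."""
    bytes_in = size_in_bytes(before)
    bytes_out = size_in_bytes(after)
    percent = (bytes_in - bytes_out) / bytes_in * 100 if bytes_in else 0.0
    return {
        "events_in": len(before),
        "events_out": len(after),
        "bytes_in": bytes_in,
        "bytes_out": bytes_out,
        "percent_reduced": round(percent, 1),
    }
```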
#7 Abide by the principle of least privilege
If you don’t require that all of your purchased solutions be capable of operating under the principle of least privilege, you should. Your telemetry pipeline platform should not be an exception. Ensuring that users and applications have access only to the data and operations they require to perform their functions ensures that the blast radius is contained should a compromise occur. Even if you are not currently architecting for zero trust, improving your security posture by reducing unnecessary access to systems helps reduce your attack surface.
#8 Manage agents as fleets
Complex infrastructures may require deploying hundreds, even thousands, of agents to collect telemetry data. GitOps has certainly simplified the process. However, the team that manages the telemetry pipeline may not be the same team responsible for GitOps. In such cases, even the smallest configuration changes are dependent on another team’s priorities, workflows, and workloads. The ability to manage collection agents from within the pipeline platform empowers teams to act swiftly and adds flexibility.
#9 Automated operations
Your pipeline should be able to take advantage of cloud native best practices to support operational efficiencies. Specifically, it should include:
- automatic or one-command scaling
- automated load balancing
- support for automatic retry of data
- persistent storage to prevent data loss
- support for self healing
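Of the items above, automatic retry is the easiest to sketch in a few lines. The example below retries a failed send with exponential backoff; the `send` callable, the choice of `ConnectionError`, and the delay values are assumptions for illustration, and a production pipeline would typically pair this with persistent buffering.

```python
import time


def send_with_retry(send, batch, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call send(batch), retrying on ConnectionError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return send(batch)
        except ConnectionError:
            if attempt == max_attempts:
                raise  # Give up and surface the error after the final attempt.
            # Wait 0.5s, 1s, 2s, ... with the default base delay.
            sleep(base_delay * 2 ** (attempt - 1))
```

The `sleep` parameter is injectable so the backoff schedule can be exercised in tests without actually waiting.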
If you are considering adding a telemetry pipeline to improve your monitoring and observability strategy, your telemetry pipeline should be built using the best practices that you apply to the rest of your infrastructure.
Continue learning about telemetry pipeline management
We hope that this post has been helpful as you build your telemetry pipeline strategy. For more insights into telemetry pipelines, check out these on-demand webinars:
- How to transform your logs to meet your observability and security needs
- How to manage collection agents at scale
- How to transform your metrics and traces at scale
To learn how to simplify your telemetry data collection and processing, schedule a demo.