Today, no one blinks before buying clothes online, knowing that the return process is seamless: pre-addressed bags and USPS pickups right at your front door. While we still value the ability to “try before you buy,” overall our options have increased thanks to the wide world of e-commerce.
Now think about experiences you’ve had trying out a new observability vendor – it’s basically the polar opposite. Backing out of a wrong choice is not as straightforward as returning a sweater: it’s common to become locked in financially by a contract, and technologically by proprietary instrumentation libraries that make leaving a mere dream.
To be frank, the entire migration process between observability solutions isn’t straightforward, and it depends heavily on your engineering teams’ effort, system expertise, and enablement support. There is no magic “easy” button here, and I am wary of anyone promising a way to “future-proof” anything in technology: the pace of progress and innovation is fast and ever-changing. Standardizing on instrumentation formats, common metadata, and processing rules, or templatizing dashboards and monitors company-wide, can ease the burden of a migration, but these efforts often come too late and increase the scope of a migration project.
What is achievable today is the ability to easily try observability solutions before you buy, thanks to vendor-neutral open source instrumentation standards and increased interoperability between telemetry formats. OpenTelemetry’s savvy decision to support a wide range of data formats means you can pipe all your signals and configure export destinations through one component, the Collector, instead of a multitude of signal- or format-specific agents. Piping your existing telemetry through the OpenTelemetry Collector lets you quickly ship data to multiple solutions, easing the first step of piloting new observability platforms today.
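As a minimal sketch of that fan-out (the endpoint URLs below are illustrative placeholders, not real vendor addresses), a single Collector config can receive OTLP once and export the same trace stream to both your current vendor and the one you’re piloting:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  otlphttp/current-vendor:
    endpoint: https://otel.current-vendor.example   # placeholder endpoint
  otlphttp/pilot-vendor:
    endpoint: https://otel.pilot-vendor.example     # placeholder endpoint

service:
  pipelines:
    traces:
      receivers: [otlp]
      # The same spans fan out to both destinations; adding or removing a
      # pilot vendor is a config change, not an application change.
      exporters: [otlphttp/current-vendor, otlphttp/pilot-vendor]
```

Named exporter instances (the `otlphttp/<name>` syntax) let you point the same exporter type at as many destinations as you want to evaluate.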
Observability migration pains of the past
Nobody has ever said: “I love migrating observability vendors.”
The fact of the matter is, no one really wants to embark on migrating between observability tooling: it’s a thorny project that combines high visibility with high criticality, and it’s an exciting way for your SRE team to discover unowned services.
Migrations are often driven by the need to reduce the financial cost, meaning your migration deadline hinges on the end date of your original contract. This leaves your engineering group tasked with exploring options, selecting a new solution, and completing the migration of data, dashboards, and monitors on a fixed timeline.
How did you wind up with an ill-fitting tool in the first place? Common reasons include:
- Power users become blockers – An early and influential engineer championed a particular product but hasn’t been able to enable or teach others, and so remains the only power user
- DIY can’t scale – In the early days an engineer set up a cheap homegrown platform, which was effective for a while but hasn’t scaled as the business has grown
- Platform bloat – It used to be a great experience but over time platform bloat created a disjointed user experience and engineers spend more time navigating various sub-menus than actually troubleshooting
- Pricing bait and switch – Pricing that is friendly at first, when the initial license covers everything. But upon renewal, must-have features move into a higher license tier, forcing you to pay more.
- M&A creates sprawl – An acquisition brought a new system that was already set up with a particular tool; siloed data makes integration efforts challenging
- Top-down mandates – One of your board members is closely affiliated with a vendor and is the final decider
When proprietary instrumentation was a truly closed system, getting from “let’s start a pilot” to actually querying data and fully evaluating a solution’s features was an exercise in frustration. It meant finding a few teams with time to add new instrumentation, and getting the ops team to figure out how to configure, deploy, and operate vendor-specific agents and components. Then you hoped that engineers would be intentional about spending time to learn, explore, and use the solution, and would provide helpful feedback to inform a go/no-go decision. If that wasn’t exhausting enough, you’d repeat this cycle for as many vendors and solutions as you were willing to explore, all while navigating mysterious and obtuse pricing models to figure out whether this would actually be cheaper than the previous solution.
The more time you have to spend on instrumentation, configuration, and setup, the less time there is to learn your way around a new product, test common workflows, and give informed, quality feedback. Without engineers actually using the solutions for real-world investigations, you risk ending up with an ill-fitting tool, and could wind up in this same position two to three years down the road, after the immediate financial pain is soothed.
Owning your data destiny
What’s maddening is that telemetry is your organization’s data. It’s information about the performance of key features, customer experience, service health, infrastructure, and third-party dependencies. Not being in full control of how or where this data is sent is a major challenge when it comes to piloting a new observability platform.
Addressing this pain has brought popular open source observability projects and all of the major observability and monitoring vendors together to create industry standards and interoperability through OpenTelemetry. A great example is the recent work to merge the Elastic Common Schema (ECS) into the OpenTelemetry Semantic Conventions, creating a single open schema for all and bringing more consistent signals across metrics, logs, and traces while also adding key security domain fields! Or the joint efforts between Prometheus and OpenTelemetry that enable each project to ingest or translate metrics between their different metric formats.
This era of collaboration and co-opetition is bringing OpenTelemetry’s stated goal of enabling effective observability through high-quality, ubiquitous, and portable telemetry closer with every release.
Why the vendor-agnostic OTel Collector is so compelling
The OpenTelemetry Collector is a vendor-agnostic way to receive, process, and export telemetry data: not just traces, but also metrics and now logs! This would all start to feel like yet another way to get locked in if the Collector only spoke the OpenTelemetry Protocol (OTLP), but its power comes from the choice to integrate with other data formats, likely including ones you’re already running. Options include Prometheus, Fluent Bit, StatsD, Zipkin, and yes, even vendor-specific receivers.
In addition to exporting telemetry in a wide variety of formats, another powerful feature of the Collector is the ability to process data that’s passing through. From configuring sampling, to deriving metrics from spans, to enriching data with business-specific attributes, the possibilities with processors are endless.
It is this combination that makes the OpenTelemetry Collector so compelling: the ability to receive and send a wide variety of telemetry formats. It removes the immediate need for re-instrumentation, exposes many ways to interact with and enhance your telemetry on your terms, and lets you configure when and where that data is sent and stored. In a world with portable telemetry, vendors can no longer rely on locking you into their ecosystem and leaving you to suffer from sweeping pricing changes and unpredictable bills.
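To make that combination concrete, here’s a sketch of a Collector config assuming the contrib distribution (the scrape target, attribute value, and backend endpoint are illustrative placeholders): existing Prometheus metrics and Zipkin spans come in, spans are sampled and enriched with a business-specific attribute, and everything leaves as OTLP:

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: app
          static_configs:
            - targets: ["app:9090"]   # placeholder scrape target
  zipkin:   # accepts spans from existing Zipkin-instrumented services

processors:
  batch:
  probabilistic_sampler:
    sampling_percentage: 25           # keep roughly a quarter of spans
  attributes:
    actions:
      - key: team                     # hypothetical business attribute
        value: checkout
        action: insert

exporters:
  otlp:
    endpoint: backend.example:4317    # placeholder backend

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch]
      exporters: [otlp]
    traces:
      receivers: [zipkin]
      processors: [probabilistic_sampler, attributes, batch]
      exporters: [otlp]
```

Note that no application changed its instrumentation here: the Prometheus and Zipkin formats you already emit are translated to OTLP inside the Collector.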
Migrating with the OTel Collector
The first wave of early adopters has been finding success with the Collector, contributing back upstream to ensure it is production-ready.
Earlier this year at the SCaLE conference, software engineer Paak Quansah presented PlutoTV’s journey deploying the Collector as part of a larger effort to standardize their approach to observability. Their setup will sound familiar to most: PlutoTV was contending with multiple instrumentation formats and pipelines and was looking to consolidate tooling and use a single vendor. What’s different is their experience while evaluating multiple observability vendors. Instead of the pain described above, installing the Collector enabled PlutoTV to send telemetry to multiple vendors without any additional instrumentation or changes at the application level. The Collector only needed a configuration update to know where to send the data!
Another success story came from Adobe, whose many engineering teams use different processes, libraries, formats, and a mix of observability vendors and open source solutions. Managing all of this was quite a challenge for their SRE teams, and they adopted the Collector. At Open Source Summit, Shubhanshu Surana, an SRE at Adobe focused on observability, presented “OTel Collector: The Swiss Army Knife of Observability.” During his presentation he said:
“With the introduction of the OpenTelemetry Collector, as well as the OTLP format, it became super easy for us; we are able to send data to multiple vendors, multiple toolings, with just a few changes on our side.”
With these enterprises pioneering use of the Collector at scale, we can soon leave the days of painful observability evaluations in the past.
It’s easier than ever to explore observability options
There has never been a better time to explore your options for observability tooling than now. So is trying out alternative platforms with the Collector as easy as trying and returning a sweater bought online? Almost. Thanks to the efforts of countless open source contributors and vendor collaboration unlocking the power to route all sorts of telemetry through the Collector, pilot pain is significantly lessened.
The evaluation phase is just the beginning, and the partner you choose will influence how smoothly and successfully the rest of the migration goes. If open source observability is new to your organization, finding a vendor with deep expertise that can advise and support your SRE team, especially as they get things set up, helps get the data flowing.
A transparent pricing model lets engineering leaders, finance, and executives rest easy knowing there will be no unpleasant billing surprises. Stellar customer support means getting questions answered quickly and accurately. A variety of training and enablement formats means more engineers can use the platform to effectively investigate and understand system behavior.
If that doesn’t sound like your current experience with an observability vendor, consider Chronosphere where we will pair with you every step on your migration journey.