Real talk: Why does Datadog cost so much?

Cloud native architectures are churning out more data, increasing the cost of monitoring tools like Datadog. We lay out better ways to manage these expenses in this blog.

On: May 15, 2024

Lately, I’ve come across numerous discussions on X (formerly Twitter), Reddit, and Hacker News about the steep costs associated with Datadog. The topic has become so prevalent that engineers are sharing their strategies online for aggressively cutting the metrics they send.

But what led us to this point? What is making these costs skyrocket? Why do some companies spend more on observability tools than on their actual production infrastructure? Many point fingers at issues like vendor lock-in and corporate greed, which undeniably play a role.

Yet, there’s a deeper problem stemming from the shift towards containerized infrastructure and microservices applications. Without addressing this foundational issue, we are doomed to repeat these mistakes.

Disclosure: I work for a Datadog competitor

Indeed, I work for Chronosphere, a rival to Datadog. Rest assured, this article isn’t a sales pitch. Datadog is a formidable competitor, having established a strong business over the years.

From 2015 to 2018, I worked at a company that partnered closely with Datadog. During that time we witnessed its impressive growth and aspired to emulate it. However, I also observed growing frustration among Datadog’s customers, who faced escalating, unpredictable costs and felt trapped with no exit.

This observation influenced my decision to join Chronosphere in 2021, anticipating that this trend was coming to a head. Before entering this field, I analyzed the market and discovered that for every dollar spent on public cloud services, about 25-35 cents goes to observability — a market primed for disruption.

The primary driver behind Datadog costs: data growth

The primary driver behind soaring Datadog costs is the sheer volume of observability data — metrics, logs, traces, and events — far exceeding initial predictions. Datadog’s pricing model and architecture were not designed to handle such volumes. Several factors contributed to this data proliferation:

Business factors:

Digital transformation has integrated more technology across various business sectors, increasing the amount of data needed to monitor system health and ensure smooth operations.
Higher consumer expectations demand more robust, always-available services, which in turn require more granular observability data. The 2023 Online Reliability Report notes that Americans generally switch brands after fewer than four disruptions on a digital platform, so the stakes are high.
Data hoarding is common: when data arrives second by second, it is hard to tell what will be useful later, so teams keep far more than they need.

Technical factors:

Containers and microservices bring significant benefits in cloud native environments, yet they also inherently generate far more telemetry data, because the health of each component and service needs detailed monitoring. Each container or microservice now generates as much observability data as a virtual machine (VM) or traditional application once did. Instead of managing a few VMs and several applications, organizations now deal with thousands of containers and numerous microservices, which multiplies data volume accordingly.
Cloud native architectures also scale quickly because their decentralized design lets engineering teams deploy components rapidly. This leads to a sharp increase in the number of services and containers and, consequently, in the volume of data produced.
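
To make the multiplication above concrete, here is a rough back-of-the-envelope sketch. It assumes each container emits about as many metrics as a VM once did, as described above. The function name and every figure in it are hypothetical, chosen only to illustrate the arithmetic, and they say nothing about how any vendor counts or bills metrics.

```python
def active_series(instances: int, metrics_per_instance: int) -> int:
    """Rough count of distinct time series: one per instance per metric."""
    return instances * metrics_per_instance


# Hypothetical figures, for illustration only.
METRICS_PER_INSTANCE = 100  # assumed to be the same for a VM and a container

vm_era = active_series(instances=20, metrics_per_instance=METRICS_PER_INSTANCE)
container_era = active_series(instances=2_000, metrics_per_instance=METRICS_PER_INSTANCE)
growth_factor = container_era / vm_era

print(f"VM-based estate:      {vm_era:,} series")
print(f"Containerized estate: {container_era:,} series")
print(f"Growth factor:        {growth_factor:.0f}x")
```

Under these assumptions the series count grows in direct proportion to the number of instances, which is why moving from tens of VMs to thousands of containers inflates data volume even when nothing else about the monitoring changes.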

What’s the result of all this data growth? Increased observability costs that are less predictable and don’t deliver any additional value.

Why not simply reduce Datadog pricing?

I suspect there are two reasons Datadog has not simply lowered its prices:

Shareholder value: Over recent years, Datadog’s shares have seen remarkable performance. If Datadog pricing were reduced, it would directly diminish its revenue stream, which in turn would affect reported earnings and potentially lead to a decrease in its stock price.
Cost of goods sold: Datadog has updated its architecture three times, with the latest version, Husky, being released in 2022. This latest update was aimed at improving efficiency but did not lead to a reduction in pricing, which likely means it lowered the cost of goods sold (COGS) and improved profit margins. Given that Datadog is unlikely to undertake another major architectural overhaul in the near future, it appears reluctant to reduce its pricing and risk its margins.

Escape high Datadog costs: Alternatives to explore

If you’re considering options beyond Datadog, there are a few paths you can explore:

#1: DIY open source

Managing your observability in-house using open-source tools is an appealing option. For metrics and traces, open-source solutions like Prometheus and OpenTelemetry, along with time series databases such as Mimir, Thanos, and M3, have evolved into widely recognized standards and present a feasible alternative to Datadog.

However, it’s important to understand that this route might not lead to actual cost savings. It essentially shifts spending from a vendor bill to your own staff and infrastructure. The human and infrastructure costs required to maintain these systems are substantial, and economizing excessively could lead to future complications.

For instance, a former colleague recently transitioned his company from a costly commercial SaaS product to an open-source framework. He found that while the move appeared cost-effective on paper, about 8% of the development staff was now committed full-time to managing this system, offsetting any real savings.
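
A quick way to sanity-check this trade-off is to put staffing next to the vendor bill. The sketch below is a simplified model, not a real costing method. The 8% staffing share comes from the anecdote above, while the headcount, the cost per engineer, the infrastructure cost, and the SaaS bill are all hypothetical placeholders to be replaced with your own numbers.

```python
def diy_annual_cost(engineers: int, staff_fraction: float,
                    cost_per_engineer: float, infra_cost: float) -> float:
    """Annual cost of self-hosting: staff time spent on the stack plus infrastructure."""
    return engineers * staff_fraction * cost_per_engineer + infra_cost


# Only staff_fraction comes from the anecdote; every other figure is hypothetical.
engineers = 200                # hypothetical size of the development staff
staff_fraction = 0.08          # share of staff running the stack (from the anecdote)
cost_per_engineer = 200_000.0  # hypothetical fully loaded annual cost per engineer
infra_cost = 500_000.0         # hypothetical annual compute and storage cost
saas_bill = 3_000_000.0        # hypothetical annual commercial SaaS bill

diy = diy_annual_cost(engineers, staff_fraction, cost_per_engineer, infra_cost)
savings = saas_bill - diy

print(f"DIY annual cost: ${diy:,.0f}")
print(f"SaaS annual bill: ${saas_bill:,.0f}")
print(f"Savings from DIY: ${savings:,.0f}")
```

With these placeholder inputs the staffing line alone exceeds the SaaS bill, so the apparent savings turn negative. Different inputs can flip the result, which is the point of running the numbers before committing.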

#2: Next-generation observability tooling

Here, I’m not promoting Chronosphere, but highlighting that modern tools are being developed with the anticipation of data growth right from the start. These tools put the control of costs back into the hands of the users, ensuring there are no unexpected charges.

Just as Datadog and New Relic took over from older systems like SolarWinds, BMC, and CA Technologies, a newer generation of observability tools is beginning to emerge. Engage with these providers to learn how they handle large volumes of observability data effectively, rather than merely applying short-term fixes to ease pricing pain.

Conclusion

Many teams have reluctantly accepted the steep costs and vendor lock-in associated with Datadog as a necessary part of observability, often without knowing the full range of available options. Datadog, with its established presence, continues to appear as a solid choice, even in light of its pricing strategies and proprietary software. However, it doesn’t have to stay this way.

As observability evolves, new players are introducing solutions designed from the outset to handle the complexities of large-scale data. These alternatives offer more flexibility with your infrastructure, greater control over your data, and more transparency about your expenses. This evolution in the market is paving the way for observability teams to adopt tooling with pricing models that still make sense as they scale.