Real talk: Why does Datadog cost so much?


Cloud native architectures are churning out more data, increasing the cost of monitoring tools like Datadog. But there are better ways to manage these expenses.

Rachel Dines | Head of Product & Solution Marketing | Chronosphere

Rachel leads Product & Solution Marketing for Chronosphere. Previously, she built out product, technical, and channel marketing at CloudHealth (acquired by VMware). Prior to that she led product marketing for AWS and cloud-integrated storage at NetApp and also spent time as an analyst at Forrester Research covering resiliency, backup, and cloud. Outside of work, she tries to keep up with her young son and hyper-active dog, and when she has time, enjoys crafting and eating out at local restaurants in Boston.


I’ve seen so many X (formerly Twitter), Reddit, and Hacker News threads lately discussing the high cost of Datadog. It’s such a hot topic that engineers are publishing blog posts about their approaches to brute-force dropping metrics just to rein in the bill.
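To give a flavor of what that brute-force dropping can look like, here is a minimal sketch using the OpenTelemetry Python SDK’s views API. The instrument name pattern and the console exporter are hypothetical stand-ins for illustration, not anyone’s actual configuration:

```python
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import (
    ConsoleMetricExporter,
    PeriodicExportingMetricReader,
)
from opentelemetry.sdk.metrics.view import DropAggregation, View

# Hypothetical: discard every instrument matching "http.client.*" so the
# data never reaches the (billable) backend at all.
drop_noisy_metrics = View(
    instrument_name="http.client.*",  # wildcard match on instrument name
    aggregation=DropAggregation(),    # throw matching data points away
)

provider = MeterProvider(
    metric_readers=[PeriodicExportingMetricReader(ConsoleMetricExporter())],
    views=[drop_noisy_metrics],
)
```

It works, but it’s a blunt instrument: someone has to decide, metric by metric, what is safe to throw away.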

But how did we get here? Why are these costs so high? Why are companies paying more for their observability than for their production infrastructure? There is a lot of finger-pointing at vendor lock-in and corporate greed, and both certainly play a part.

But there is a bigger underlying issue: the fundamental architecture changes that come with adopting containerized infrastructure and microservices applications. If we don’t understand and address this issue, history will repeat itself.

Disclosure: I work for a Datadog competitor

OK, it’s true, I work for Chronosphere, a company that competes with Datadog. I promise this article will not pitch you on our product. Datadog is a strong competitor, and I’ve watched it build an amazing business for years. 

My previous company was a close Datadog partner from 2015-2018, and we watched its meteoric growth, which we desperately wanted to emulate. At the same time, I watched Datadog customers get more and more disgruntled with skyrocketing and unpredictable costs, yet they felt they couldn’t leave. 

This was part of what drove me to join Chronosphere in 2021, as I saw this trend coming to a head. Before I joined this space, I did some market sizing and analysis and determined that observability had the biggest attachment to infrastructure spend: For every $1 you spend on public cloud, you’re likely spending 25-35 cents on observability. This struck me as a market ripe for disruption. 

The real culprit behind high Datadog costs: data growth

The root cause of the problem is simple: There is far more observability data (metrics, logs, traces, and events) than these tools ever predicted, so they are neither architected for this data volume nor priced to accommodate it. There are multiple reasons we ended up with so much data.

Business drivers:

  1. Digital transformation: The infusion of technology into more business sectors naturally comes with more data to oversee system health and ensure smooth overall system operations. 
  2. Higher customer expectations with greater stakes: According to the 2023 Online Reliability Report, on average, Americans tolerate fewer than four instances of unreliability or outage on an app or website before switching to a competitor. Operating high-performance and highly available services that deliver an exceptional customer experience requires more granular observability data.
  3. Data hoarding: It can be tough to know which data is useful when so much of it arrives minute by minute. Without the right tools to parse it, you can fall into the trap of “I never know when I’m going to need this data” and hang onto far more than necessary.

Technical drivers: 

  1. More telemetry data generated by containers and microservices: Cloud native environments (i.e., containers and microservices) have significant advantages, but they naturally produce more data because you need to monitor the health of every individual component and service. Each container and microservice now emits roughly as much observability data as each virtual machine (VM) and monolithic app used to, but instead of dozens of VMs and a handful of apps, you now have thousands of containers and dozens of microservices (see the back-of-the-envelope sketch after this list).
  2. Scale of some cloud native environments: By design, cloud native is decentralized – and engineering teams can quickly spin up components – which means an exponentially growing number of services and containers generating data.
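To make that multiplier concrete, here is a back-of-the-envelope sketch. Every number in it is a hypothetical assumption for illustration, not a measurement:

```python
# Assumption: each VM or container emits a similar number of time series.
series_per_unit = 100

# Legacy estate: dozens of VMs running a handful of monolithic apps.
legacy_series = 40 * series_per_unit           # 4,000 time series

# Cloud native estate: thousands of containers across dozens of services.
cloud_native_series = 3_000 * series_per_unit  # 300,000 time series

print(f"Data growth: {cloud_native_series / legacy_series:.0f}x")  # 75x
```

Swap in your own counts and the growth factor changes, but the direction does not: the number of monitored units grew by one to two orders of magnitude while per-unit telemetry stayed roughly constant.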

This data growth causes observability spending to skyrocket. Vendors kept pricing models built for legacy monitoring volumes rather than adapting their pricing or software to account for the growth, so cloud native architectures suddenly became shockingly expensive to monitor.

Why can’t Datadog just lower its prices?

I suspect there are two reasons for this:

  1. Shareholder value: Datadog’s stock has performed phenomenally over the last several years. Lowering prices would immediately hit revenue, which would hit reported earnings, which would drag down the stock price.
  2. Cost of goods sold: Datadog has gone through three architecture generations, with its latest, Husky, released in 2022. That rearchitecture focused primarily on efficiency, yet prices didn’t come down, so I assume the gains went toward reducing cost of goods sold (COGS) and getting margins to a healthy place. Since Datadog probably won’t invest in another rearchitecture anytime soon, it’s unlikely to compromise those margins by lowering prices.

Alternatives to Datadog

There are a couple of options if you don’t want to pay for Datadog.

#1: DIY open source

One attractive alternative is running your own observability stack in-house with open source tools. The good news is that, at least for metrics and traces, open source tools have come a long way and are coalescing around industry-accepted standards. Prometheus and OpenTelemetry, paired with a time series database backend such as Mimir, Thanos, or M3, are a viable alternative to Datadog.
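To give a sense of the entry point for that DIY route, here is a minimal sketch of app-side instrumentation with the prometheus_client Python library. The metric name and port are made up for illustration; a self-hosted Prometheus would scrape this endpoint and could remote-write to Mimir, Thanos, or M3:

```python
import time

from prometheus_client import Counter, start_http_server

# Hypothetical metric: total checkout requests processed by this service.
REQUESTS = Counter(
    "checkout_requests_total",
    "Total checkout requests processed",
)

if __name__ == "__main__":
    # Expose /metrics on port 8000 for a self-hosted Prometheus to scrape.
    start_http_server(8000)
    while True:
        REQUESTS.inc()  # stand-in for real request handling
        time.sleep(1)
```

Prometheus’s pull model means each service just exposes an HTTP endpoint like this one; the operational weight sits in the scrape, storage, and query layers you now own.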

But it’s important to note that this typically won’t save you money in real dollars; it mostly trades a vendor bill for people and infrastructure costs. Those costs are non-trivial, and if you try to cut corners, you may regret it.

I was recently talking to a friend who moved his company off an expensive commercial SaaS offering and onto in-house open source tools. He admitted the company isn’t actually saving any money once you account for the roughly 8% of developer headcount now dedicated to running the system.

#2: Next-generation observability tooling

This is not the part where I pitch Chronosphere. It’s the part where I point out that a new generation of tools is being built with data growth as a baseline assumption, keeping the cost of the solution in the customer’s hands so you don’t get surprise overages.

Just as Datadog, New Relic, and similar tools displaced the previous generation of SolarWinds, BMC, and CA Technologies, this new generation of observability tooling is starting to make waves. Talk with these vendors and understand whether they tackle the problem of too much observability data at the source, or merely bandage over it with better unit economics.

Conclusion

Datadog’s high cost and vendor lock-in have somehow become a necessary evil; you know you need observability, but you’re not sure of all the options. Datadog has been around long enough that it seems like a viable option, despite its billing practices and proprietary code. But it doesn’t have to be this way.

As more observability companies enter the space, more options emerge that are built from the start to address high-cardinality data growth: options that give you more flexibility with your infrastructure, greater control over your data, and more visibility into your monthly bill, ultimately setting observability teams up for a more sustainable and cost-effective operating model.
