Predict 2022: Cloud-native environments are heading toward a tipping point with observability data growth this year.
On: Apr 11, 2022
This year’s Predict ’22 conference was packed with savvy forecasts about the year ahead, including one from Martin Mao, our co-founder and CEO. During his forecasting session, Martin spoke on a topic near and dear to Chronosphere’s heart: observability data growth, which he predicts will reach a tipping point in the next year.
Here are key takeaways from Martin’s talk, including four approaches to taming observability data growth.
Data growth – particularly observability data growth – is heading toward a tipping point. Looking at how the sheer volume of observability data has grown over the years, as companies have moved from on-premises infrastructure to VM-based cloud architectures and then to cloud-native architectures, observability data growth is far outpacing the growth of the business itself and of the underlying infrastructure footprint.
To explain how data will reach a tipping point, Martin illustrates an example scenario about an organization that a few years ago was running applications on VMs. Flash forward to today: “You’re running a Kubernetes cluster on those same VMs. You’re trying to go cloud native and you’ve broken up that application into microservices instead, and you’re running those in containers on the same cluster.”
At this point, he explains, your infrastructure footprint and your bill for that cluster don’t really change much, and neither does your business. At the same time, this new cloud-native architecture brings real advantages.
Enter the data tipping point: One of the unfortunate side effects of a cloud-native architecture is that it produces a lot more data, Martin explains.
“There are tens of these containers running on top of every VM. And then on top of that, they’re very ephemeral. They’re changing all the time, and every time they change, it’s a brand new container. You can just imagine the sheer amount of data that produces, and it’s on the same infrastructure footprint.”
The same explosion happens with the single monolithic application, Martin explains, which has been broken up into tiny microservices. Each microservice emits roughly the same order of magnitude of data as the original application, and now there are many more of them: “So you can see very quickly how adopting cloud-native architecture results in so much more data being produced,” says Martin. The downside, he explains, is that “as that data volume starts to grow, generally the cost of the observability system rises in a correlated manner.”
Because costs rise in step with observability data, the relative cost of observability compared to infrastructure climbs as well. “When we talk to companies, we hear that within the infrastructure bill, observability is around number two or three and it’s growing very quickly,” Martin said. The problem for businesses ultimately centers on a lack of value, he said, noting:
“We believe that cost disparity will reach a tipping point that is going to force companies to do something about it.”
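A rough way to see why the data outpaces the footprint is a back-of-the-envelope model. All of the numbers below are illustrative assumptions, not figures from the talk:

```python
def series_count(instances, metrics_per_instance, churn_factor=1.0):
    """Unique time series produced; churn inflates the count because
    every replacement container creates brand-new series."""
    return int(instances * metrics_per_instance * churn_factor)

# Before: a monolith on 100 long-lived VMs, ~150 metrics per instance.
monolith = series_count(100, 150)

# After: the same 100 VMs now run ~10 containers each, and ephemeral
# containers are replaced ~5x over the observation window.
cloud_native = series_count(100 * 10, 150, churn_factor=5.0)

print(monolith)                  # 15000
print(cloud_native)              # 750000
print(cloud_native // monolith)  # 50: 50x more data, same footprint
```

Even with conservative assumptions, the multiplication of instances and the churn of ephemeral containers compound into an order-of-magnitude jump on the same infrastructure.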
According to Martin, the overall approach to solving the data growth problem is to understand the outcomes. This means inspecting the data, understanding what you are using the data for, and optimizing the data for those use cases.
Martin lays out four techniques to help with data growth: retention, resolution, efficient storage, and aggregation.
One way to efficiently manage data is through retention. To illustrate, Martin describes a typical environment in which there is one set retention period of 13 months for all of your data, whether you run your systems in-house or you use a managed provider. This means every piece of data that gets produced, no matter how you want to use it, is retained for 13 months.
“That may be useful for some pieces of data, but in the modern cloud-native architecture, where we are deploying multiple times a day, and a container is only around for a couple of hours, a huge amount of that modern observability data does not need to be retained for 13 months.”
Taking the retention example further, Martin explains:
“Retention is not just setting the retention across all of your data,” said Martin. “It’s understanding each use case – for each subset of the data that serves those use cases – and understanding the optimal retention period.”
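As a sketch of what per-use-case retention might look like, the following maps use cases to retention periods; the use-case names and periods are hypothetical, and real systems would express this in their own configuration:

```python
# Hypothetical retention policies keyed by use case; the names and
# periods below are illustrative, not product defaults.
RETENTION_BY_USE_CASE = {
    "alerting": "2d",             # alerts only look at recent data
    "incident_debugging": "30d",  # containers live hours, not months
    "capacity_planning": "13mo",  # year-over-year trends still need it
}

def retention_for(use_case, default="13mo"):
    """Return the shortest retention that still serves the use case,
    rather than keeping every series for a blanket 13 months."""
    return RETENTION_BY_USE_CASE.get(use_case, default)

print(retention_for("alerting"))      # 2d
print(retention_for("unclassified"))  # 13mo (falls back to the default)
```

The point of the sketch is the shape of the decision: the blanket default only applies to data nobody has classified yet, while classified data gets the shortest retention its use case allows.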
The second data-optimizing technique – resolution – refers to how frequently data is emitted: “Am I tracking the CPU every 10 seconds versus every minute versus every hour?” Martin breaks down how to think about resolution.
Martin notes: As with retention, people often make the mistake of using defaults, which range from 10 seconds to a minute. However, he says, it’s critical to “understand your sub-use cases, and optimize the data for resolution for that use case.
“If we are measuring every 10 seconds versus measuring every minute, there is a 6x difference in the amount of data that needs to be produced and stored.”
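The arithmetic behind that 6x figure is simple; a quick sketch:

```python
SECONDS_PER_DAY = 86_400

def samples_per_day(interval_seconds):
    """Samples one series emits per day at a given collection interval."""
    return SECONDS_PER_DAY // interval_seconds

every_10s = samples_per_day(10)  # 8640 samples per series per day
every_60s = samples_per_day(60)  # 1440 samples per series per day

print(every_10s // every_60s)    # 6: the 6x difference Martin cites
```

Multiplied across hundreds of thousands of series, choosing a 10-second default where a 1-minute interval would serve the use case stores six times the data for no added value.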
The third technique is efficient storage. Martin and Rob Skillington, Chronosphere’s CTO and co-founder, saw its importance first-hand while running the observability team at Uber.
Last but not least, data aggregation is, according to Martin, the most effective technique for taming data growth.
Martin describes a common pattern, or problem, among the companies he talks to: “They are emitting a ton of data and the data has a lot of dimensions on it.” That makes sense, he explains, because at some point you do want to slice and dice your data by those dimensions. Martin describes a scenario around latency:
“There are many ways that you want to slice-and-dice data, because when something goes wrong, you want to pinpoint things like: What’s been affected? Or where is something going wrong?”
However, Martin notes, while dimensions, or pivots, offer huge advantages, they also produce a lot of additional data.
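To make that trade-off concrete, here is a sketch of how dimensions multiply series counts, and how aggregating away one high-cardinality dimension before storage shrinks them. The label names and cardinalities are invented for illustration:

```python
# Illustrative cardinalities for labels on a single latency metric.
LABEL_CARDINALITY = {
    "endpoint": 50,
    "status_code": 10,
    "region": 5,
    "pod": 400,  # the high-cardinality, rarely queried dimension
}

def total_series(cardinalities):
    """Worst-case series count: the product of every label's cardinality."""
    total = 1
    for card in cardinalities.values():
        total *= card
    return total

raw = total_series(LABEL_CARDINALITY)
print(raw)  # 1000000 series from one metric

# Aggregate (e.g. sum or average) across pods before storage, keeping
# the dimensions engineers actually slice by: endpoint, status, region.
kept = {k: v for k, v in LABEL_CARDINALITY.items() if k != "pod"}
print(total_series(kept))  # 2500 series, a 400x reduction
```

The aggregated data still answers “what’s been affected?” and “where is something going wrong?” by endpoint, status code, and region; only the per-pod breakdown, which few queries need, is traded away.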
The takeaway, says Martin: Understand the use case of your data and optimize it accordingly.
As the data growth problem heads toward a tipping point, companies are going to need to take an outcome-driven approach to observability.
Request a demo for an in-depth walkthrough of the platform!