KubeCon 2023: Martin Mao talks cloud native observability with theCUBE

The Cloud Native Computing Foundation’s (CNCF) KubeCon + CloudNativeCon is the premier event for all things cloud native, cloud computing, and open source, which makes it the perfect place for Chronosphere CEO Martin Mao to chat about observability and how our platform can help curb data costs and cut through the complexity of cloud native environments.

On: Nov 20, 2023

At this year’s show in Chicago, Martin sat down with theCUBE’s Savannah Peterson and Rob Strechay to chat about the launch of Chronosphere Lens, observability industry trends, generative artificial intelligence, and how the company invests in its product.

If you don’t have time to watch the whole video, check out the transcript below.

In the spotlight: Chronosphere Lens and Change Event Tracking

Savannah Peterson: Good afternoon, cloud native community, and welcome back to Chicago. We’re here at KubeCon CloudNativeCon, CNCF’s largest North American event. My name is Savannah Peterson, joined here by my fabulous co-host, Rob Strechay.

I’m excited for our segment this afternoon. We have a CUBE alumnus. He’s been on the show four different times, and coincidentally, also has had four funding rounds to complement that. Most recently, raising a $115 million Series C up round in January, which in this ecosystem, landscape, and economy is a serious accomplishment. Martin, it is so great to have you back on the show.

Martin Mao: Thank you so much. Looking forward to our conversation today, and hopefully we can keep that run going.

Savannah: There’s a lot going on with Chronosphere. Let’s start with the big product announcement that you had today. Yeah. What’s going on?

Martin: So we just announced Chronosphere Lens. It’s a new and more effective way for developers to interact with observability data. So you’ve heard of the phrase, “single pane of glass,” and that’s been the way observability vendors have approached a problem for a while now. But the problem with a single pane of glass is there’s so much data these days, just seeing all of it in one place is actually not that useful.

It’s almost too much data and developers are drowning in it. What Chronosphere Lens does is it actually analyzes all of the raw observability data underneath the covers and then extrapolates insights and knowledge from that raw data and presents that information to the developer in a context that they want to solve their problems.

We found it to be far more effective. And in fact, for Chronosphere customers, they’re finding that they’re reducing their SEV0, SEV1 incidents by about 75% or so. It’s a lot more of an effective approach to solving the problem than the standard [solution].

Savannah: That’s a considerable impact. I imagine it would have taken some calibration to, pun intended, figure out how to hone that lens to serve the developers the information they need in that moment.

Martin: Exactly. So now you know why we called it Chronosphere Lens, right? Because it’s all about focusing into the data and the insights that’s really relevant for an individual developer, right? Because remember, these systems we’ve built are really complex these days. And as a developer, I only own one part of the system. I don’t actually want to see what’s going wrong. There’s a lot of things going wrong with the whole system, right? I don’t actually want to see all of it. I just want to hone in on what I care about and my dependencies. And that’s what Lens does. It helps you hone in on that piece there.

Rob Strechay: It helps them focus. You also announced that there was new Change Event Tracking as well as part of that. And I think it’s to your point, bringing it all together and being able to coordinate that information.

Martin: A hundred percent. With Change Event Tracking, we’re tracking every change that happens to a system, a deployment, a new piece of infrastructure. Those changes there. To your point, for the whole system, there are a lot of changes, right? So what I really want to see is not just all the changes, what are the changes relevant to me as a developer? What are the changes that I need to know about?

Again [having that] in context helps me solve my problem. So we did add event tracking, but the trick with Lens is that we scope down and focus on just the events that are relevant for that particular incident there.
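The scoping idea Martin describes can be sketched in a few lines. This is an illustrative sketch only, not how Chronosphere Lens is implemented: the `ChangeEvent` type, the `relevant_events` function, the one-hour window, and the service names are all hypothetical. It assumes "relevant" means a change to a service the developer owns or depends on, made shortly before the incident.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ChangeEvent:
    service: str   # hypothetical: the service the change was applied to
    kind: str      # hypothetical: e.g. "deploy" or "config"
    at: datetime   # when the change happened


def relevant_events(events, scope, incident_at, window=timedelta(hours=1)):
    """Keep only changes to services in `scope` made within `window` before the incident.

    `scope` is the set of services a developer owns plus their dependencies.
    Results are ordered most recent first.
    """
    start = incident_at - window
    in_scope = [
        e for e in events
        if e.service in scope and start <= e.at <= incident_at
    ]
    return sorted(in_scope, key=lambda e: e.at, reverse=True)


if __name__ == "__main__":
    incident = datetime(2023, 11, 8, 12, 0)
    all_events = [
        ChangeEvent("checkout", "deploy", incident - timedelta(minutes=10)),
        ChangeEvent("search", "deploy", incident - timedelta(minutes=5)),
        ChangeEvent("payments", "config", incident - timedelta(minutes=30)),
        ChangeEvent("checkout", "deploy", incident - timedelta(hours=3)),
    ]
    # The developer owns "checkout" and depends on "payments" (assumed names).
    for event in relevant_events(all_events, {"checkout", "payments"}, incident):
        print(event.service, event.kind, event.at.isoformat())
```

The filter drops the unrelated `search` deploy and the three-hour-old `checkout` deploy, leaving only the changes a paged developer would plausibly want to see first.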

Rob: I mean, you know, a lot of times it’s garbage in, garbage out, and you’re drowning in data and things of that nature. So this really helps them focus on what they’re trying to achieve with their piece of an application. Be it, you know, a microservice or a container or, hey, if they’re more of the platform engineering type people.

Martin: 100%. It’s what they own, to your point, it’s a microservice or a piece of infrastructure there. But it’s also the dependencies, right? Because it’s not just what I own, it’s what I’m dependent on, it’s what’s dependent on me. So it’s more than just their tiny piece of the world, but it’s not the whole, it’s not the whole thing, right? That’s the focus there.

And then to help with the drowning in data, you can imagine the drowning in data is causing a whole bunch of cost problems for the industry right now. We are solving that problem at the same time as well. What we find is that when a company adopts cloud native, what they see is, on average, a 12.4x increase in the volume of data that’s being produced, right?

That’s an order of magnitude more data. The reason why people spend so much money on observability tooling, it’s not because the event has been. Really, it’s actually because there’s so much more data, and these systems cost a lot more because of that. It’s one of these unintended side effects when you adopt cloud native. You can imagine you’re not really paying for the cloud providers to run Kubernetes for you.

You’re really paying for the [virtual machines (VMs)] and the hardware underneath the covers. So the cost of the infrastructure is the same and yet you run a different architecture. The workload you can put through it is roughly the same, but your observability bill grows potentially 12.4x. So that’s a huge problem that the industry’s facing right now.

What we are trying to do with that is actually help control the growth of data. We can’t just make it cheaper, unfortunately, because there’s diminishing returns on how efficient you can get the back ends, and the data is exploding in an exponential format. So it’s not just about making it cheaper.

The trick is to understand and show the companies and the customers what is costing you all of your expensive bill, what is causing all of that cost and out of the data, what is valuable and what isn’t valuable. And really trying to get people to match, “Okay. I want to spend money only on my valuable use cases, so how do I go and match those two things together and make sure I only spend money on my valuable use cases?” And we’re able to do that through a feature called the Control Plane in Chronosphere. And on average, our customers are optimizing their data about 60 percent and you can imagine therefore saving at least 60 percent on their observability bills there.
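To make the two figures in this answer concrete, here is a small arithmetic sketch. The 12.4x growth and 60 percent reduction come from the interview; the `projected_volume` function, the baseline of 100 units, and the assumption that the bill scales linearly with data volume are illustrative only.

```python
def projected_volume(baseline, growth_factor=12.4, optimized_fraction=0.60):
    """Return (volume after cloud native adoption, volume kept after optimization).

    growth_factor: average increase in data volume cited in the interview.
    optimized_fraction: share of data removed through optimization, as cited.
    """
    grown = baseline * growth_factor
    kept = grown * (1 - optimized_fraction)
    return grown, kept


if __name__ == "__main__":
    # Assumed baseline of 100 units of data before moving to cloud native.
    grown, kept = projected_volume(100.0)
    print(f"after adoption: {grown:.0f} units, after optimization: {kept:.0f} units")
```

Under these assumptions, a 60 percent reduction still leaves roughly five times the original volume (12.4 × 0.4 ≈ 4.96), which is consistent with Martin's framing of the goal as controlling data growth rather than only making storage cheaper.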

Rob: Yeah. And I think you’ve always been known for no overage contracts and things of that nature. How has that really been helping you in the past year since we talked last?

Martin: Yeah, it’s been helping. I mean, you can imagine the economy hasn’t been great in the past year, right? So, a lot of companies have been pressured. They don’t have an extra budget. The concept of an overage is really painful for a lot of companies, because you didn’t budget for it internally and originally. And now you have to go find a discretionary budget in order to cover the overage. We hate that model and we don’t believe in that model at all. So, in our product, there is no concept of an overage.

What we get companies to do is we use this Control Plane in order to control that growth of data and avoid overages there. And the behavior we actually saw with the companies that we work with this year is that they actually went back and used the tooling to go and optimize their data further as their budgets weren’t expanded. That’s a behavior we saw this year and we’re really happy that we’re able to help customers control both their data volume growth as well as their cost there.

Growing interest around cloud native and observability

Savannah: It makes sense that it’s built in and you’re using your own tool to help them navigate what that even looks like. When we chatted last year, you mentioned that observability was just beginning. Where are we at a year later here in Chicago?

Martin: I would say it definitely advanced a lot since last year, right? And I think the overall cloud native journey has advanced a lot. Like if you look at this show, this reminds me of maybe San Diego 2019. It feels like, you know, the buzz around cloud native is back. And we see it in the industry, especially around the enterprises, right? Like the enterprises have been thinking about this for years.

And in the last year, we’ve really seen the enterprise actually start to take action and actually shift workloads over to cloud native architecture. That trend has continued, and along with that, you really need observability and cloud native observability there. So that trend just continues.

Savannah: So what are we going to say a year from now?

Martin: I think that trend will only continue a year from now. I do think a lot more companies are going to run into a lot of these cost challenges, because again, as you make this transition, the data loads are going to grow and there’ll be even more pressure and costs. I don’t think we will turn back to a 2021 economy. I think there’s a new level of efficiency expectation around, so I think that trend will continue. I think there’s going to be a need for tools that are much better at focusing on the problem and being more effective at solving the problem because that’s just a really bad trend in the industry as well.

Rob: I mean, one of the things that we hear from organizations is, “hey, I don’t have an observability problem. I have 10 tools in this space.” And I think what a lot of them are looking for is really a platform. It seems like that’s really the direction you’re going and the approach you’re taking.

Martin: It’s about the platform for sure, because you need to consolidate those tools down. But even when you do that to perhaps one platform, you don’t just want one platform in a single pane of glass. It actually goes a little further than that as well, and hence Chronosphere Lens there, right? So, we’re trying to be two steps ahead of where people ultimately need to go.

What about artificial intelligence in observability tools?

Savannah: If you’re two steps ahead, where are we going with generative AI?

Martin: That’s a great question. I think there is a lot of buzz in the space. And what we found when we played around with it, just like most other companies there, is that the public models are interesting, for sure. But the problem with the public models is they’re never built on your company’s data, right?

So, if you think about observability, what you want to know is: What are the issues with my system? And the public models are not trained on your company data, so therefore they’re not quite as effective.

We started down this path, and it actually matches with what we’re doing with Chronosphere Lens, which is analyze the raw data and build, not quite a vector database, but build a knowledge graph on top of the raw data, and have that, which is specific to a particular company, go fuel the insights that you want to go present.

And then when we built that piece, we thought about it, and we’re like, actually, you know what, the chat interface is interesting for some use cases. For the observability use case, it’s actually not the best because you don’t actually want to ask it proactively, like, what’s wrong with my system. It’s much better for the tool just to tell you, here’s what’s wrong with my system.

Savannah: You want it constantly giving you that information that’s necessary.

Martin: Exactly, and perhaps the text interface is not the best. You know, you really want it as a visual thing of like, this is what’s wrong and let me show you what’s wrong there. What we found is actually [Chronosphere] Lens is a much more effective way of presenting that information. It’s a similar dynamic to what a lot of these AI models are. But we just don’t think chat is quite as effective as the main interface to observability. But we did do a lot of playing around there, and I do think, you know, there is a general trend in that direction for sure.

Rob: Yeah, and you don’t have to do all the prompt engineering as a person. We have our own [large language model (LLM)], and I can tell you that I’ve gotten very good at writing prompts.

Martin: Exactly. You have to be really good at it, and you have to know what questions to ask, right? Whereas, if [observability tools] can just tell you what the answers are, you don’t need to figure out what questions to ask, right? The whole point of observability is we want to show what’s wrong with the system rather than have you ask, “Well, is it this? Nope. Is it this? Nope. Is it this?”

Savannah: And troubleshooting when you’re in a moment of panic too and something’s not working. The last thing you want to do is be trial and erroring.

Martin: Exactly. Hence, you can imagine an interface where it’s like, “Look, we know who you are. We know why you’re here. Cause you’re not just randomly perusing this tool. Like you’re here for a reason. We just paged you and here are some of the areas that are wrong.” Proactively presenting that information in a very curated way is what we found to be a more effective approach.

Industry happenings and open source standards

Rob: It would seem like there’s definitely consolidation going on in the industry, and it would seem that that makes a lot of sense because if you looked out here a couple of years ago, it was almost all observability companies. Now that’s shrunk down a bit, [but] there’s still quite a good core. How do you see the openness that’s going on here [and] in your company?

Martin: Yeah, I think the openness and industry standards like OpenTelemetry, things like Fluent, Fluent Bit, and Fluentd, I think that really actually opened up the industry quite a lot and actually made room for new companies like Chronosphere to enter because you are now no longer, like, locked into these proprietary agents that are producing the data for you.

It’s actually better for the world overall because, you know, as a company, I now am not locked into one particular vendor. I can actually instrument and own the data creation myself, and I can pick whichever tool I want. It also opens the door for new players like Chronosphere to enter, and really reduce the moat of the incumbent players that were here. Then of course, earlier this year, a lot of the big companies out there were taken private. So that paves the way for perhaps new companies to go take their spot [and] that’s something we’re excited about.

Partnerships and cloud native players

Savannah: I suspect there’s probably a little bit of privacy around this, but what are some of the partners and players that you’re getting to support on their journey?

Martin: Yeah, for sure. On the partnership side, what we look at is from a problem perspective, what are problems that our customers can solve using our tools? And then what are other problems that they can’t solve? Because we can’t solve everything under the sun, right? There are certain things that we can’t do well, right? So, we really want to find complementary partners there. And we’ve found great partnerships with particular companies like CrowdStrike out there. We have a good partnership with them.

As a smaller company, it’s more effective for us to partner with the best-in-breed solutions out there. I do think the good thing about our partnership is we’re also looking for companies that are really aimed at the future of cloud native environments as opposed to non-cloud native environments. So there are a lot of good partnerships for us from that perspective, and a lot of them are on the floor here at the show.

Savannah: And I bet everybody wants to work with you with that 115 million purse. What is the capital going to unlock for you in terms of your ability to scale?

Martin: Big investment in the product. To your point, we don’t do everything yet, but we will, soon. I think a lot more investment in the product because companies are looking for platforms. They’re not looking for 10 different solutions. They’re looking for fewer and fewer. They’re looking for consolidation. So we do want to expand our product suite there. We do want to make sure we stay ahead. So, you know, the AI conversation was very interesting. We want to make sure we invest enough in the right ways to stay ahead and not get caught. So a lot of investment there, a lot of investment also on the go-to-market side as well, you know, you can imagine the popularity of the product and the company is growing as well. That has to fuel that growth there. So more capital for growth, I would say.

The three C’s: context, control, and confidence

Rob: I think what I’ve liked about it and have been briefed by you guys before is that you really focus on confidence, control, and context as kind of the three pillars for your roadmap. And how does that help you with focus as well?

Martin: Exactly. And you put it in better words than I can. Those are exactly the three things that all begin with C, right? To your point, it’s about focusing on those three areas, right? Because when we looked at the problem, there wasn’t a need for necessarily just another tool. There were enough tools out there, right?

We really looked at what everyone is struggling with and what everyone’s struggling with is cost and controlling the data growth. So that’s one area we want to focus on and we’ve been focusing on it for years already.

Second one is context because the effectiveness of these tools just wasn’t there anymore for the new environment.

And third one is confidence and reliability. Observability is the thing that’s telling you how reliable your product and service [is]. So if [your observability] isn’t reliable, there’s no way you can be reliable as a company, right? So that was a cornerstone of the company.

Those are honestly the three main strategies there. Everything we invest in has to fall in one of those three categories there. And that’s what allows us to stay focused. So while we look at other things that have to be aligned, like the Chronosphere Lens piece, it’s aligned with the context, right? If it’s not [about] context, if it’s not about control, if it’s not about confidence, we don’t really make those investments.

Rob: Well, having, having been a product guy, I appreciate focus and pillars. It gives you that North Star to aim for.

Savannah: It’s exciting. So [to wrap up], what does it mean for you and the team to be here at KubeCon?

Martin: Look, this is our favorite event all year long; we love it. And again, as I mentioned earlier, this year feels reminiscent of KubeCon San Diego, [which was] pre-pandemic. It’s not like the trend to cloud native has slowed down at all, but I think the in-person energy from shows like this hasn’t quite returned yet. Last year we got close. This year I really feel it. You know, it’s a great reinforcement of how far we’ve climbed in the cloud native space and the fact that this is still the future that the whole industry is moving towards.
