By Amanda Mitchell, Sophie Kohler, Eric Schabell, and Scott Kelly
KubeCon + CloudNativeCon North America 2022 is here, and the Chronosphere team has officially touched down in Detroit. First-day excitement is definitely in the air, and we can’t wait to make this year’s KubeCon legendary and meet as many of you as possible. If you can’t experience the KubeCon chaos in person, fear not – we’ll be live-blogging here all week so that you can stay up to date with the latest happenings.
✅ Keynotes: We’ll be listening in and reporting back on highlights from keynotes every day of this week.
✅ Sights and sounds: We’ll be on the lookout.
✅ Follow Chronosphere on Twitter: Did we miss anything good? Go ahead and ping us on Twitter (@Chronosphere) if you have something to add (fun pictures encouraged).
Now, let’s see what’s happened so far…
Friday Oct 28, 2022
It’s been a whirlwind of a week. From the booth crawl to Barcade night and all of our sessions, the Chronosphere team had a blast. As we enter our final day at KubeCon 2022, we still have so much to share. So, let’s dive in.
In the session DoorDash’s Journey From StatsD To Prometheus With 10 Million Metrics/Second, Emma Wang from DoorDash and Benjamin Raskin from Chronosphere discussed the challenges DoorDash was facing with StatsD and why they needed to find another solution. Key challenges with StatsD included limited support for tags, a metric count that scaled with user traffic, and a lack of histograms, to name a few. DoorDash’s requirements for their next solution included:
✅ Use of open-source
✅ Having standard conventions for tagging and naming
✅ Ability for self-service
Prometheus met these needs. To ensure scalability and reliability, they turned to Chronosphere, an observability solution that is 100% compatible with Prometheus. While the migration from StatsD was challenging, they now have a stable, scalable, and flexible observability solution that provides a greatly improved view of DoorDash service health. Check out more in this DoorDash case study.
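To make the tagging and histogram gaps more concrete, here is a minimal Python sketch contrasting a plain StatsD line with Prometheus-style text exposition output. The metric names, label values, and bucket boundaries are hypothetical and not taken from DoorDash, and the helper functions are illustrative only, not part of any library.

```python
from bisect import bisect_left


def statsd_line(name, value, metric_type="c"):
    """Format a plain StatsD datagram (no tags in the basic line format)."""
    return f"{name}:{value}|{metric_type}"


def prom_sample(name, labels, value):
    """Format one Prometheus text-exposition sample with labels."""
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{name}{{{label_str}}} {value}"


def prom_histogram(name, labels, observations, bounds):
    """Render cumulative histogram buckets plus _sum and _count lines."""
    counts = [0] * (len(bounds) + 1)
    for obs in observations:
        # bisect_left puts obs == bound into that bound's bucket (le = "<=")
        counts[bisect_left(bounds, obs)] += 1
    lines, running = [], 0
    for bound, count in zip(list(bounds) + ["+Inf"], counts):
        running += count  # Prometheus buckets are cumulative
        lines.append(prom_sample(f"{name}_bucket", {**labels, "le": bound}, running))
    lines.append(prom_sample(f"{name}_sum", labels, sum(observations)))
    lines.append(prom_sample(f"{name}_count", labels, len(observations)))
    return lines


# Hypothetical example values
print(statsd_line("orders.created", 1))
print(prom_sample("orders_created_total", {"service": "checkout"}, 1))
for line in prom_histogram("latency_seconds", {"service": "checkout"},
                           [0.25, 0.5, 2.0], [0.25, 0.5, 1.0]):
    print(line)
```

The labeled samples show how one metric name can be sliced by service without encoding the tag into the name itself, which is the kind of standard tagging convention listed above.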
“When DoorDash initially started this migration, they had more than 7,000 alerts, 1,500 dashboards, and metrics coming from about 130 services. This was a massive collaboration across all teams and engineers – specifically the central observability team. Fast forward, post-migration, DoorDash now has about 2,600 alerts and over 2,000 dashboards. Although the name of this talk is ’10 million metrics per second’, as of this week, we have reached nearly 15 million metrics per second.” – Benjamin Raskin, Solutions Architect at Chronosphere
ICYMI – Chronosphere KubeCon 2022 action shots – Final Day!
Last but definitely not least, if you’ve checked into our virtual booth for a chance to win an Atari 2600 Lego Set, or our grand prize – a $3,000 Ultimate Gaming Setup, we are drawing winners next week, so stay tuned!
Thursday Oct 27, 2022
Day 2 of KubeCon is here! (Have you had the chance to become an O11Y Legend yet?) We can’t talk about today without talking first about what happened last night.
Who likes to play a classic arcade game? About 400 people, that’s who! After the show closed last night, a few hundred of us boarded two buses and descended on Barcade, “the Original Arcade Bar,” in downtown Detroit to play a huge variety of video games and pinball machines – think Street Fighter, Donkey Kong, Frogger, Super Mario, RoboCop, Daytona USA, and even Centipede! Drinks were flowing, shouts of victory were heard, and hearts were broken for those trying to best the machines. It was really fun, and a visit to a Barcade near you is highly recommended.
Keynote – Day 2 KubeCon 2022
Thursday morning, day two of KubeCon and CloudNativeCon 2022 in Detroit opened with Emily Fox from Apple taking the crowd through multiple CNCF project updates that were not discussed yesterday.
This was followed by an AWS (Amazon Web Services) sponsored talk about how they contribute to a long list of CNCF projects. It was nice to see a large vendor take the time to contribute back to the community projects it uses.
Next up was a recorded demo keynote by VMware with a silly use case: building a rainbow-selling application to show how cool it is to build, deploy, and run on cloud native infrastructure. Everything was diagrammed out with cartoon-like images such as a small child would draw. The demo was done almost entirely from the command line in a terminal, catering to the deeply technical audience. The theme of the story was that you can do all this with their Tanzu platform tools.
An intermezzo session followed where our CNCF host again shared a series of CNCF project updates, just as she did in the opening.
Intel’s Cathy Zhang followed with a story about how they are taking silicon on a journey all the way to serverless computing. This was a high-level overview of how Intel is providing hardware for cloud native infrastructure.
SUSE Distinguished Engineers Erin Boyd and Matt Farina were next with a journey into what the CNCF Technical Oversight Committee (CNCF TOC) does and what it is like to be a member of this committee. They also shared how the process works for entering a project into the CNCF ecosystem.
The final keynote of the day was presented by Ricardo Rocha, a computing engineer in the CERN cloud team, who shared a set of tools they have used over time in their cloud native efforts. They like to call it a cloud native Swiss Army knife, and it is all based on various open source CNCF projects. He demoed a few of the simple tools he likes, a wonderful idea for presenting a 101-level overview of the cloud native experience.
The keynotes were closed out with our CNCF hosts sending the crowd off into their day of sessions, hands-on events, and other KubeCon learning opportunities. Also, not to be forgotten, tonight is the networking party!
Wednesday Oct 26, 2022: Hello KubeCon 2022!
Hey there, KubeCon! First day excitement was definitely in the air as we kicked off KubeCon 2022 with thought-provoking sessions and fun and games!
First up – a first-hand recap of the day-one keynote sessions
KubeCon and CloudNativeCon opened today with the keynotes first up on our agenda. The morning kicked off after breakfast with Priyanka Sharma, Executive Director of the CNCF, welcoming everyone: all 176,360 contributors and seven million end users of the CNCF ecosystem. Her message was one of hope after the trying time behind us and, as some believe, ahead of us. Economic challenges on the horizon are not going to slow down cloud native technology investments. Priyanka shared some encouraging numbers about investment, financial commitments, and growth of IT over the coming years. A nice end user story showed how CNCF projects are leveraged to build a fully autonomous taxi service in California. The final takeaway has to be that, with 1,000+ maintainers of CNCF projects providing solutions for over seven million users, the cloud native community is stronger than ever!
The second keynote of the day was by VMware, talking about their contributions to the CNCF with a new project, Carvel. They told a story about how their platform teams were having trouble with complex deployments and all that comes with them. They feel that the only way to successfully automate Kubernetes deployments is to use GitOps, and they featured their usage of various CNCF projects to achieve this.
This was followed by recorded updates from the graduated CNCF projects. These were short five-minute blurbs on their progress over the last year. The segment closed with a surprise visit by a Ukrainian CNCF member who thanked the community for their continued support.
Next up was a story about fostering Kubernetes community growth through learning by Le Tran from Kasten by Veeam. It was a story about building bridges for everyone to cross into the Kubernetes world. She toured some of the sites, communities, and people that put effort into providing quality tooling, content, and learning paths for personal growth in the cloud native world. She also announced an open commitment by introducing KubeCampus.io. Check it out!
Ayse Kaya from Slim.AI shared a research project where they dissected the top 100 most used container images in the cloud native ecosystem. She is a data scientist who scanned over 900k container images and shared what they found, including data points such as 60% of public containers having more security vulnerabilities than one year ago, while 70% of users demand zero vulnerabilities. The story was basically that:
- Kubernetes is complex and hard
- Developers are challenged to create secure images
- All is not lost, as the use of containers and Kubernetes is still growing
- And the focus on supply chain security in image building processes has become commonplace in many organizations
As you can see, the cloud native ecosystem is vibrant, alive, maintained, and becoming even more central to all manner of commercial organizations around the world. The user growth of CNCF projects shows recognized potential: from 3.9M+ in 2020, to 5.3M+ in 2021, to 7M+ this year. Cloud native is real and the future is bright.
Google and Chronosphere share the stage
Multi-Cluster Stateful Set Migration: A Solution To Upgrade Pain sounds like a mouthful, but leave it to Peter Schuurman from Google and Matt Schallert from Chronosphere to pare it down so that it just makes sense. The two dove into the complex patterns developed at Chronosphere to safely migrate stateful workloads and coordinate maintenance operations for thousands of pods across multiple zones and regions. They also discussed a new enhancement to Kubernetes called StatefulSet Partition, how it integrates into a multi-cluster deployment like Chronosphere’s, and how it can dramatically simplify operations so teams can focus on core business logic instead.
Matt kicked things off by explaining what Chronosphere does and how we use Kubernetes, saying “We’re a SaaS observability platform built for cloud native environments. Given how mission critical observability is, we have a high SLA and take reliability incredibly seriously. Our Kubernetes footprint spans multiple regions, with thousands of Kubernetes nodes in total. These clusters run a mix of stateless and stateful workloads, but the largest stateful workload is our metrics datastore.”
From there, Matt and Peter walked through a discussion on time series database (TSDB) architecture, stateful operations, Prometheus, and cross-cluster migration use cases. Matt shared that “in terms of why you’d want to move a stateful workload between clusters, there can be a variety of reasons.” For example, many organizations start off with just one cluster, especially for production use cases. At some point that cluster will grow too large, and you may want to split it up, whether for scalability or blast radius reasons.
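As a rough illustration of moving a stateful workload between clusters one pod at a time, the Python sketch below models an ordinal-based split: the source cluster keeps the low pod ordinals, the destination cluster owns the high ones, and each step moves the highest remaining ordinal across. This is a simplified model under our own assumptions, not the mechanism presented in the talk and not a Kubernetes API; the function names and the three-replica example are hypothetical.

```python
def split_by_ordinal(total_replicas, boundary):
    """Return (source, destination) pod ordinals for a given boundary.

    Ordinals below `boundary` stay in the source cluster; the rest
    are owned by the destination cluster.
    """
    if not 0 <= boundary <= total_replicas:
        raise ValueError("boundary must be between 0 and total_replicas")
    return list(range(boundary)), list(range(boundary, total_replicas))


def migration_steps(total_replicas):
    """Yield the split after each step, moving the highest ordinal first."""
    for boundary in range(total_replicas, -1, -1):
        yield split_by_ordinal(total_replicas, boundary)


# Hypothetical three-replica workload
for source, destination in migration_steps(3):
    print(f"source={source} destination={destination}")
```

At every step each ordinal is owned by exactly one cluster, which is the property that makes an incremental migration of this shape safe to pause or resume.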
Be sure to watch the replay when it’s available.
ICYMI – Chronosphere KubeCon 2022 action shots – Day One
On the lighter side, here are some sights from around the show on day one of KubeCon 2022:
Tuesday Oct 25, 2022: Prometheus Day
Prometheus Day is shaping up to be a busy day, ranging from our participation in a documentary on Prometheus, to keynotes and sessions with customers like Robinhood and DoorDash, to sharing some big news around Prometheus.
News! Chronosphere and PromLabs donate PromLens to Prometheus organization
Today we announced that we have partnered with the co-founder of Prometheus, Julius Volz, to break down barriers and make the process of open-source adoption easier for engineers. Chronosphere and PromLabs will donate PromLens to the Prometheus organization, making it free for anyone to use as a standalone query-building app. With PromLens, users can:
✅ Edit confidently
✅ Build visually
✅ Debug and fix any PromQL query
✅ Gain X-Ray data
✅ Detect hints and actions
Check out the blog by our co-founder and CEO Martin Mao for more details about our PromLens news.
Keynote featuring Chronosphere and our customer, Robinhood
Reality check: Is it time to raise your metrics game?
The first keynote of Prometheus Day was delivered by Martin Mao, co-founder and CEO of Chronosphere, together with Yash Kumaraswamy, Senior Staff Engineer at Robinhood, and it was a powerful 10 minutes. Yash and Martin tag-teamed nicely to feature Robinhood’s experience achieving great ROI while raising its metrics game across the organization. Yash specifically covered microservice details around generating metrics. Finally, they announced the full open sourcing of PromLens, which has been part of Chronosphere’s platform for some time now. Together with creator Julius Volz, we have made it available on GitHub under the Prometheus organization, with full instructions on how to build it from scratch. Now we can all up our PromQL game!
More points mentioned in the talk include:
✅ Robinhood’s journey with metrics
✅ The close correlation with availability and MTTR
✅ Robinhood’s investments that led to higher returns and availability
“Whether or not to raise your metrics game is really thinking about the return on investment into your metrics – how much value are you getting out of it, and how much are you putting in?” – Martin Mao, co-founder and CEO of Chronosphere
Session featuring Chronosphere and our customer, DoorDash
Centralized vs. Decentralized Prometheus Scraping Architecture with DoorDash
Rabun Kosar, Infrastructure Software Engineer with DoorDash, and Ales Koprivnikar, Senior Sales Engineer with Chronosphere, dove deep into how DoorDash handles metrics at scale. Their session offered the audience a better understanding of DoorDash’s decision to use a decentralized approach for metrics collection in general, but also why it chose to deploy centralized collectors for specific use cases. They also spoke about how Chronosphere’s observability platform enables them to use a hybrid approach.
In this session, Rabun and Ales touched on:
✅ What collecting data today looks like
✅ DoorDash’s strategy for improving commerce experience
✅ The importance of distributed architecture, especially with Kubernetes
They also answered some probing questions from the audience about solving the challenge of metrics loss (read more about that in the DoorDash case study here) and DoorDash’s Journey From StatsD To Prometheus (which is covered in the Friday 11am session).
“In order to support the community infrastructure, we need to have a stable infrastructure within our system. When DoorDash infrastructure was smaller, we were using StatsD. However, our infrastructure was growing much faster. We had a central, massive system. It was a single point of failure, and we had operational failure. We needed a better, scalable solution as we grew.” – Rabun Kosar, DoorDash
Documentary on origins of Prometheus
Inside Prometheus: An Open Source System That Changed Technology
Today was also the day everything you need to know about Prometheus came to light. We participated in a unique documentary about the origin and mass adoption of Prometheus. It features the voices of pioneers, engineers, and executives in the Prometheus community, including our co-founder and CTO Rob Skillington and the co-founder of Prometheus, Julius Volz. Definitely set aside less than 30 minutes and give it a watch – you’ll find out about the observability problem they set out to solve and the role the open source community played in the success of this Herculean project.
We’ll be back tomorrow with updates from day one of the main event – KubeCon 2022! For now enjoy some sights from around the Huntington Place conference center.
Monday Oct 24, 2022: Observability Day
We dove right in with our very first keynote, Distributed Tracing – The Struggle is Real, with Chronosphere Field CTO, Ian Smith. Distributed tracing has the potential to solve so many cloud native challenges – so why is widespread adoption lagging?
In this keynote, Ian explored:
✅ The muddy distributed tracing waters, and why it’s not working
✅ Why everyone is struggling with distributed tracing
✅ What distributed tracing can really do, and how to extract value
“The way that we need to be thinking about distributed tracing is going to be the answer that allows us to scale that complexity back into a human perspective. Metrics are a really good way of allowing us to comprehend huge amounts of data. Distributed tracing offers us that same potential, because of its inherent connections.” – Ian Smith, Chronosphere Field CTO
After the close of our very first keynote, our team celebrated by stunting some limited edition O11Y Legend sneakers, and capturing some photos of our booth offerings. We will be back tomorrow to update you on all things Prometheus Day.
Chronosphere’s KubeCon full week tick-tock
Stop by our Booth #G15
Don’t be shy! Come check out our decked-out booth, talk to the booth crew about Chronosphere’s new and delightful features, and sit in on a demo.
Are you a gamer? To make this year’s KubeCon extra special, we’ve created our very own O11Y Legend video game. In this race to the clouds, players collect containers and build ladders to be the first to reach the platform. The top three online players will win a $100 Steam gift card, and ALL players will be entered to win a $50 Steam gift card.
Keynotes and sessions galore! Make sure to check out all of the Chronosphere hosted sessions and events happening this week:
Monday: Open Observability Day
✅ Distributed Tracing – The Struggle is Real
When: Monday, October 24
Tuesday: Prometheus Day
✅ Reality check: Is it time to raise your metrics game?
When: Tuesday, October 25
✅ Prometheus documentary debut
When: Tuesday, October 25
Watch it here
✅ Centralized vs. Decentralized Prometheus Scraping Architecture with DoorDash
When: Tuesday, October 25
Wed-Fri: KubeCon + CloudNativeCon North America 2022
✅ Multi-Cluster Stateful Set Migration: A Solution To Upgrade Pain
When: Wednesday, October 26 | 11:55 AM
✅ DoorDash’s Journey From StatsD To Prometheus With 10 Million Metrics/Second
When: Friday, October 28 | 11:00 AM
Where: Register here
✅ Take a chance! Enter our 4:30 pm daily drawing for a chance to win an Atari 2600 Lego Set. And while you’re at it, enter to win the grand prize drawing: a $3,000 Ultimate Gaming Setup.
✅ Booth crawl: Just one of Wednesday’s many highlights. Be sure to include our Booth #G15 on your route!