Join Snap Inc.’s tech lead, Evan Yin, as he talks about how Chronosphere has helped Snap improve developer productivity, resolve incidents faster, increase cost efficiency, and more.
Launched in 2011, Snap, Inc. is a technology company that serves over 750 million daily customers worldwide with Snapchat, Spectacles, and Bitmoji. These products let consumers share experiences with friends, use VR/AR, and develop a social network avatar.
As Snap scaled and saw massive user growth, its in-house, open-source observability solution was becoming expensive and too time-consuming for engineers to realistically manage.
In order to be competitive in a crowded market and delight the millions of customers they serve, Snap must deliver best-in-class availability and performance to customers worldwide. However, the company’s previous observability setup wasn’t meeting expectations in terms of:
Snap has been a long-time partner of Chronosphere. Evan Yin, technical lead at Snap, says he discovered the technology when it was still a GitHub page for M3 – before Chronosphere’s official founding – as part of his research for a new observability solution. Snap had two key goals for its new solution: improving developer productivity and improving cost efficiency.
According to Yin, Snap chose Chronosphere because of its ability to control observability data volumes and provide high availability at scale, as well as its fit for cloud native architectures (including Prometheus support). He also felt that his values about observability aligned well with the Chronosphere founder’s vision for the company, which provided a favorable foundation for such a partnership.
“Chronosphere is built to specifically address issues in the cloud native world. We can always rely on them to solve the problem,” he says.
Reduced costs: For observability at scale, working with Chronosphere reduced data volumes by more than half and saved thousands of engineering hours. Using Chronosphere’s Control Plane, the team defines which data labels are most important and which are noise, making it faster and easier to triage issues while significantly reducing costs.
“At Snap we’re religious about cost efficiencies, so when we built our observability system we asked ourselves: how can we reduce waste right from the beginning?”
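To make the label-based reduction described above concrete, here is a minimal Python sketch of the general idea: aggregating away a noisy, high-cardinality label so that fewer time series need to be stored. This is not Chronosphere's Control Plane API. The function `aggregate`, the label names (`service`, `region`, `pod`), and the sample values are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def aggregate(samples, keep_labels):
    """Sum sample values after dropping every label not in keep_labels.

    samples: iterable of (labels_dict, value) pairs (hypothetical shape).
    keep_labels: label names judged important enough to retain.
    """
    totals = defaultdict(float)
    for labels, value in samples:
        # The series identity is now defined only by the retained labels.
        key = tuple((name, labels[name]) for name in keep_labels)
        totals[key] += value
    return dict(totals)


# Hypothetical raw data: one series per pod, in two regions.
raw_samples = [
    ({"service": "chat", "region": region, "pod": f"pod-{i}"}, 1.0)
    for region in ("us", "eu")
    for i in range(4)
]

# Count distinct label sets before and after dropping the "pod" label.
raw_series_count = len({tuple(sorted(labels.items())) for labels, _ in raw_samples})
aggregated = aggregate(raw_samples, keep_labels=("service", "region"))

print(raw_series_count, "->", len(aggregated))  # prints "8 -> 2"
```

In this toy data set, dropping the per-pod label collapses eight series into two while preserving the per-region totals, which is the kind of trade-off a team makes when deciding which labels are signal and which are noise.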
Support for cloud native architecture at scale: As part of its infrastructure upgrade, Snap adopted Google Kubernetes Engine (GKE) for its customization capabilities and flexibility. Chronosphere is built to oversee cloud native architecture and easily integrates with Google Cloud offerings. Because Chronosphere supports large-scale environments, Snap was able to scale its observability solution 5x – from 50 million time series to 250+ million time series – to support all of the use cases it needed.
Improved observability reliability: Chronosphere offers an industry-leading 99.9% uptime SLA that has never been broken. Compared to the challenges with downtime and reliability that Snap experienced with their previous solution, this was a big improvement.
“Our developers [now] have the freedom to emit high cardinality metrics and load dashboards faster. They do not have to worry about metrics availability anymore,” said Yin.
Developer productivity: Since moving to Chronosphere, the central observability team has seen a 90% decrease in on-call pages. The team can now work on value-added tools for Snap instead of constantly putting out fires in the observability platform. On top of that, the rest of the Snap engineering team can be more productive, with faster-loading dashboards and queries and less time spent worrying about metrics.
“We always want our developers to be able to deliver the features faster or even just load their dashboards faster so that they can resolve the incident faster.”
Snap’s adoption of Chronosphere not only helped the company set itself up for future success; it also ensured application availability for its millions of daily users and made behind-the-scenes operations much smoother and more reliable.
“We’re trying to provide the best in class observability tools for the service owners at Snap so they can manage their services more smoothly and efficiently,” says Yin.
As an engineering team, Snap is experiencing with Chronosphere:
Learn more about Chronosphere and see it live in a 1:1 demo by scheduling a meeting with our expert team.