Companies look to improve cloud native resiliency with SRE, says 451 Research


Cloud native environments require resiliency. Read a recent study from 451 Research to learn how site reliability engineering helps support this goal.



Paige Cruz | Senior Developer Advocate | Chronosphere

Paige Cruz is a Senior Developer Advocate at Chronosphere, passionate about cultivating sustainable on-call practices and bringing folks their aha moment with observability. She started as a software engineer at New Relic before switching to site reliability engineering, holding the pager for InVision, Lightstep, and Weedmaps. Off the clock, you can find her spinning yarn, swooning over alpacas, or watching trash TV on Bravo.


In a recent report by 451 Research, part of S&P Global Market Intelligence, eight in ten respondents confirmed that their organization already has, or is considering adding, site reliability engineering (SRE). The report is titled The growing influence and benefits of site reliability engineering – Highlights from VotE: Cloud Native. Here’s why that 80% matters and, if your organization isn’t already embracing SRE, why it’s time to start.

Customer experience matters in digital business. Exceptional engagement — powered by reliable, highly available, and performant software and systems — keeps consumers coming back and developers innovating.

Cloud native initiatives are foundational to providing a differentiated experience, which is why they are on the rise: upwards of 95% of new digital workloads are expected to be deployed on cloud native platforms by 2025. A key requirement for cloud native, according to 451 Research, is resiliency, and the intelligence firm sees the addition of SRE by organizations as a way to help “modernize and automate their environments.”

A second popular approach to meeting resiliency goals, cited by 73% of respondents in the survey, is outsourcing and/or managed services. This can help teams focus their valuable technical staff on business-critical challenges rather than keeping systems functional.

The rise of site reliability engineering 

A dozen years ago, Google came up with the SRE role to address increasing IT operational challenges, including system complexity, consistency, and uptime when running at scale. The Google-specific discipline and its later variants and adaptations have been delivering value to organizations across industries in an era of accelerated software delivery powered by cloud computing, microservices, containers, and continuous integration/continuous delivery (CI/CD) pipelines. As a proof point, only 14% of surveyed organizations with more than 1,000 employees are not currently considering adding an SRE. For more detail about the SRE role and its impact, a recommended read is: Experts weigh in on the state of SRE.

Unsurprisingly, the companies 451 Research surveyed expect existing and new SRE processes to “provide a range of benefits, including improved communication between developer and infrastructure teams and improved cost efficiency.”

Existing SRE professionals pay dividends 

Whether they are internally trained infrastructure professionals, newly hired, or a combination of the two, SRE employees primarily focus on “new capabilities or automating manual processes,” 451 Research finds.

“Almost four-fifths of respondents claim their SRE team spends more than 50% of their time automating and optimizing their environments and less time doing traditional operations work,” according to the research.

Respondents also say the hands-on work of their existing SRE employees is already benefiting their organizations in three key ways:

  • Improving cost efficiency (46%)
  • Improving communications between developers and operations (43%)
  • Implementing changes to prevent recurring outages (39%)

Yet organizations can easily burn SRE professionals out by making them responsible for both troubleshooting critical incidents and taking care of day-to-day tasks without optimal tools. In a cloud native environment, in particular, their solutions should include more efficient ways to handle data at scale; know about, triage, and remedia

Survey reveals SRE teams prefer purpose-built tools 

Because cloud native environments are highly dynamic and containers are ephemeral, a majority of respondents prefer cloud native tools not only for backup and disaster recovery but also for observability.

“Organizations are also investing in implementing end-to-end observability to proactively create alerts when problems arise and to use these tools for root-cause analysis,” according to 451 Research.

The right observability platform can be a winning solution for SRE teams to reduce downtime, remediate issues, and eliminate problems before they impact customers.

If your company is looking to boost the resiliency of its cloud native initiatives or is just getting started on its SRE journey, Chronosphere delivers a cloud native observability platform that improves outcomes while reducing costs. Chronosphere provides deep insights into every layer of your stack, from infrastructure to applications to the business. The Chronosphere platform reduces customer observability data volumes by 60%, on average, while improving key metrics such as time to detection and time to remediation.

With Chronosphere, engineers spend 50% less time troubleshooting on average. That makes SREs happier and more productive.

Discover more insights by downloading the 451 Research report.

