Measuring developer productivity requires nuance. Read about how to balance quantifiable and qualitative goals when evaluating developers.
On: Nov 14, 2023
Paige Cruz is a Senior Developer Advocate at Chronosphere passionate about cultivating sustainable on-call practices and bringing folks their aha moment with observability. She started as a software engineer at New Relic before switching to Site Reliability Engineering holding the pager for InVision, Lightstep, and Weedmaps. Off-the-clock you can find her spinning yarn, swooning over alpacas, or watching trash TV on Bravo.
Are your developers as productive as they could be? Is this question even answerable? And – perhaps most importantly – who’s asking, and why?
Lots of people say, “yes,” developer productivity can and should be measured. Organizations have invented many frameworks for measuring developer productivity, ranging from DORA to SPACE to all sorts of proprietary ones – including one by consulting firm McKinsey & Company that was introduced in late August 2023, and which ignited a storm in the DevOps community.
But many observers argue that measuring the wrong thing, or applying these frameworks in the wrong way, not only fails to give you useful information about your development team but can actually make developers less productive (and more likely to either game the system or walk away).
That’s because development work is complex and multifaceted and can’t be easily calculated with the black-and-white kinds of metrics desired by many senior, non-technical executives – such as the way salespeople can be evaluated by revenue or recruiters by the number of successful hires. Numbers simply won’t tell the whole story. Qualitative assessments are also necessary, as mounting evidence points to a direct relationship between developer well-being and productivity.
In this blog, we review the types of developer productivity measurements commonly used today. We delve into the backlash against placing developers into a surveillance culture and point out the pitfalls of various popular metrics. Finally, we describe how modern observability removes some of the top barriers to productivity that developers face with today’s cloud native environments and actually improves developer productivity.
Let’s start by defining key terms. What is developer productivity? Generally, it refers to the ability of a development team to efficiently and consistently write and deploy high-quality code that delivers value to the business.
You see right away all the questions that might arise:
Then there’s the fact that developers are humans, not machines. They are generally not happy to be put under a microscope, or to have the work they do reduced to numbers. That’s why any measuring of developer productivity must include qualitative assessments as well as quantitative ones.
To begin answering these and other questions, it’s important to understand that four different aspects of work – any type of work – can be quantified: efforts (inputs), outputs, outcomes, and impact.
The two most common metric frameworks in use for measuring developer productivity are DORA and SPACE.
Named for Google’s DevOps Research and Assessment (DORA) team that created them, the DORA standards measure outcomes. The DORA group identified four metrics for DevOps teams, with the goals of both improving developer efficiency and being able to communicate results that will have meaning for business leaders.
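The four metrics are divided into two buckets – velocity (deployment frequency and lead time for changes) and stability (change failure rate and time to restore service). Both are integral to DevOps, and tracking them together keeps teams from emphasizing speed at the expense of quality.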
DORA metrics are used to classify teams as elite, high, medium, or low performing, with the objective of using these classifications to drive improvements. According to Google’s internal measurements, elite teams are twice as likely to meet or exceed their organizational performance goals as teams in other performance categories.
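To make the two buckets concrete, here is a minimal sketch of how a team might compute the four commonly published DORA metrics (deployment frequency, lead time for changes, change failure rate, and time to restore service) from its own deployment records. The `Deployment` record shape, its field names, and the `dora_metrics` helper are assumptions made for illustration; they are not part of any DORA or Chronosphere tooling, and real pipelines will define "failure" and "restored" in their own way.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    caused_failure: bool = False            # did it degrade service?
    restored_at: Optional[datetime] = None  # when service recovered, if it failed


def dora_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Summarize velocity and stability for one team over a period."""
    if not deployments:
        raise ValueError("need at least one deployment")
    failures = [d for d in deployments if d.caused_failure]
    restore_times = [
        d.restored_at - d.deployed_at
        for d in failures
        if d.restored_at is not None
    ]
    return {
        # Velocity bucket
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time": median(
            d.deployed_at - d.committed_at for d in deployments
        ),
        # Stability bucket
        "change_failure_rate": len(failures) / len(deployments),
        "median_time_to_restore": median(restore_times) if restore_times else None,
    }


# Made-up sample data: four deployments in one week, one of which failed.
day = datetime(2023, 11, 1)
sample = [
    Deployment(day, day + timedelta(hours=4)),
    Deployment(day, day + timedelta(hours=8), caused_failure=True,
               restored_at=day + timedelta(hours=9)),
    Deployment(day + timedelta(days=1), day + timedelta(days=1, hours=6)),
    Deployment(day + timedelta(days=2), day + timedelta(days=2, hours=2)),
]
metrics = dora_metrics(sample, period_days=7)
print(metrics["change_failure_rate"])  # 0.25
print(metrics["median_lead_time"])     # 5:00:00
```

Note that the sketch summarizes a team's deployments rather than any individual's, and it reports velocity and stability side by side so that neither is read in isolation.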
A second portfolio of measurements is called the SPACE metrics (the acronym stands for satisfaction and well-being, performance, activity, communication and collaboration, and efficiency and flow). SPACE was co-developed by GitHub and Microsoft to bolster the DORA framework, which was perceived as lacking focus on the admittedly difficult-to-quantify state of developer happiness.
Either as part of DORA or SPACE, or as standalone metrics, the following are also used by organizations to measure developer productivity:
Then there are the new McKinsey metrics that have the developer community up in arms. McKinsey is not the first – and won’t be the last – to believe that DORA and SPACE don’t go far enough. McKinsey says its methodology complements DORA and SPACE with new opportunity-focused metrics, pointing out that they are necessary because software development is changing so rapidly due to generative AI tools such as ChatGPT. McKinsey’s own research found that such tools have the potential to enable developers to complete tasks up to two times faster.
Some of the new metrics McKinsey proposes include a “developer velocity index benchmark,” “contribution analysis,” and “talent capability scores” – each of which would increase scrutiny of both team and individual productivity.
Developer and father of extreme programming Kent Beck wrote on LinkedIn, “The report is so absurd and naive that it makes no sense to critique it in detail.” In a later post, he added, “Why would I take the risk of calling out a big, influential organization. It’s because what they published damages people I care about. I’m here to help geeks feel safe in the world. This kind of surveillance makes geeks feel less safe.”
Gergely Orosz, who writes “The Pragmatic Engineer” blog, co-wrote a two-part rebuttal to the McKinsey article with Beck. One of the things the authors concluded was that it is certainly a worthy goal to try to make development teams more accountable to the business, in the same way that sales and human resources (HR) teams are.
But to help developers become more productive – without causing harm – the goal has to be to develop and sustain high-performing teams, which Orosz and Beck defined as “teams where developers satisfy their customers, feel good about coming to work, and don’t feel like they’re constantly measured on senseless metrics.”
The problem with using the wrong metrics, or misapplying the right ones, say Orosz, Beck, and others who weighed in, is that the very act of measuring invites developers to change how they work so that they win against the system. Start judging your developers on how many lines of code they produce and you’ll get plenty of code – but quality may well suffer.
Tech journalist Bill Doerrfeld, blogging at DevOps.com, agreed, pointing to Goodhart’s law, named for British economist Charles Goodhart, which Doerrfeld summarized as “when a measure becomes a target, it ceases to be a good measure.” Turning a measure into a target can cause overall developer culture, as well as quality, to deteriorate.
So leaders must be very clear on what the real targets of developer productivity are for their businesses. Do you want higher-quality code that makes an impact? Then do your best to measure those things. As a case in point, Google analyzed developer inputs and outputs on a broad range of parameters and found that improved code quality correlated with increased developer productivity.
Generally speaking, most savvy CTOs don’t try to measure the productivity of individuals. There are many reasons for this, but the main one is that most industry observers – not to mention developers themselves – believe that a successful DevOps organization is not just a group of individuals who work independently, but a cohesive team that together produces valuable products and services for the business.
Developers are constantly collaborating and interacting, and much of this cannot be measured because of the interdependencies and nuances. For example, some team members might not produce a lot of code on their own, but they are invaluable to their colleagues because of their help, advice, and expertise.
Team productivity, on the other hand, is much more visible, for all the reasons discussed in this blog. Managers or HR professionals who want to assess individual performance for annual reviews or other employment milestones should invest in developing organizational best practices for people management, such as holding regular one-on-one meetings between managers and team members, soliciting anonymous feedback from all team members, and encouraging individuals to exercise personal accountability.
Much of this is based upon the culture of the DevOps team, rather than any systemic approach to track productivity.
As previously mentioned, calculating numbers alone isn’t enough. Organizations need qualitative measures in addition to quantitative ones.
The issue with depending on inputs, or efforts, such as hours worked, is that it encourages the wrong behaviors. If the company culture is to value – and reward – hours spent in front of a screen, developers will almost certainly put in the hours, but of what quality will the work be when it is delivered? In more toxic environments, it can even turn into a competition over who comes in earliest and stays latest. In such cases, developers are likely to produce empty, or even negative, work, and accomplish less than they otherwise would have done.
Some of the worst metrics fall into this category, such as counting lines of code or commits. A line of code that doesn’t achieve the purpose it’s meant for is worth nothing. And gaming a measurement like that is easy, as developers can churn out lines of indifferent code quite quickly. Any output metrics need to be in context, not treated as standalone truths.
The challenge with these two types of metrics – outcome and impact – is figuring out how responsible the DevOps team is for the outcome or impact in question. As Orosz and Beck point out, if you try to measure increased profits for the business, it’s nearly impossible to attribute such a rise to the developers alone. However, of all possible metrics, these are probably the closest to reflecting business goals – which is ultimately the point of measuring developer productivity.
Developer productivity is influenced by a broad range of factors. Here are some of the most important levers you can manipulate to improve it.
For companies operating in cloud native environments, investing in the Chronosphere cloud native observability platform is one proven way to boost the productivity of DevOps professionals.
A Market Insight Report from 451 Research, part of S&P Global Market Intelligence, titled “Chronosphere aims to improve team productivity and manage usage on its cloud native observability platform” details how businesses can benefit from the latest features in observability tooling to improve developer productivity.
Here are four ways that Chronosphere is moving productivity forward:
Measuring efforts/inputs can be useful for figuring out how your team members work, but creating targets around them is almost guaranteed to backfire and hurt both productivity and team satisfaction. Instead, make outcomes/impacts the center of gravity for targets. When you do measure efforts/inputs, use them only as context for how the team is achieving those targets, never as targets themselves.
In other words, the number of PRs or lines of code per PR is irrelevant on its own, but if you notice that fewer, bigger PRs correspond to an increased rate of changes causing incidents, it might be a good thing for the team to reflect on.
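As a rough illustration of using an input metric as context rather than as a target, the sketch below splits a team's PRs by size and compares how often each group was linked to an incident. The `PullRequest` record, the `incident_rate_by_size` helper, and the 400-line threshold are all hypothetical choices made for this example; how a PR gets linked to an incident is left to whatever your incident process already records.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    """A merged PR (hypothetical record shape)."""
    lines_changed: int
    caused_incident: bool  # was an incident later traced to this change?


def incident_rate_by_size(prs, large_threshold=400):
    """Return the share of PRs that caused an incident, split by PR size."""
    buckets = {"small": [], "large": []}
    for pr in prs:
        key = "large" if pr.lines_changed >= large_threshold else "small"
        buckets[key].append(pr.caused_incident)
    # True counts as 1 when summed, so sum/len is the incident share.
    return {
        name: (sum(flags) / len(flags) if flags else None)
        for name, flags in buckets.items()
    }


# Made-up sample data for one team.
prs = [
    PullRequest(50, False),
    PullRequest(120, False),
    PullRequest(80, True),
    PullRequest(30, False),
    PullRequest(900, True),
    PullRequest(650, False),
]
rates = incident_rate_by_size(prs)
print(rates)  # {'small': 0.25, 'large': 0.5}
```

A gap like the one in this toy data is a prompt for a team conversation, not a verdict: with samples this small it shows correlation at best, and setting a PR-size target on the back of it would invite exactly the gaming described earlier.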
It’s worth noting that the ultimate goal is not just to “be more productive” in a vacuum, but to produce valuable, high-quality software in a sustainable way. A best-of-breed observability solution can help do just that.
Request a demo for an in-depth walkthrough of the platform!