A starting guide to measuring developer productivity

Measuring developer productivity requires nuance. Read about how to balance quantifiable and qualitative goals when evaluating developers.

Paige Cruz | Senior Developer Advocate | Chronosphere

Paige Cruz is a Senior Developer Advocate at Chronosphere passionate about cultivating sustainable on-call practices and bringing folks their aha moment with observability. She started as a software engineer at New Relic before switching to Site Reliability Engineering holding the pager for InVision, Lightstep, and Weedmaps. Off-the-clock you can find her spinning yarn, swooning over alpacas, or watching trash TV on Bravo.

Are your developers as productive as they could be? Is this question even answerable? And – perhaps most importantly – who’s asking, and why?

Many people say yes, developer productivity can and should be measured. Organizations have invented many frameworks for doing so, ranging from DORA to SPACE to all sorts of proprietary ones – including one that consulting firm McKinsey & Company introduced in late August 2023, which ignited a storm in the DevOps community.

But many observers argue that measuring the wrong thing, or applying these metrics frameworks in the wrong way, not only fails to give you useful information about your development team but can actually make developers less productive (and more likely to either game the system or walk away).

That’s because development work is complex and multifaceted and can’t be easily calculated with the black-and-white kinds of metrics desired by many senior, non-technical executives – such as the way salespeople can be evaluated by revenue or recruiters by the number of successful hires. Numbers simply won’t tell the whole story. Qualitative assessments are also necessary, as mounting evidence points to a direct relationship between developer well-being and productivity.

In this blog, we review the types of developer productivity measurements commonly used today. We delve into the backlash against placing developers into a surveillance culture and point out the pitfalls of various popular metrics. Finally, we describe how modern observability removes some of the top barriers to productivity that developers face with today’s cloud native environments and actually improves developer productivity.

Some definitions to frame the discussion

Let’s start by defining key terms. What is developer productivity? Generally, it refers to the ability of a developer team to efficiently and consistently write and deploy high-quality code that delivers value to the business.

You see right away all the questions that might arise:

  • What is considered efficient?
  • How do you judge high-quality code?
  • Why focus on individual developers versus teams?
  • And how do you measure value to the business?

Then there’s the fact that developers are humans, not machines. They are generally not happy to be put under a microscope, or to have the work they do reduced to numbers. That’s why any measuring of developer productivity must include qualitative assessments as well as quantitative ones.

To begin answering these and other questions, it’s important to understand that four different aspects of work – any type of work – can be quantified.

  • Inputs: Some industry observers call this effort. In the world of software, this covers how much time, energy, thought, and creativity has gone into development activities such as designing, coding, testing, and debugging.
  • Outputs: Tangible things that are delivered as a result of the inputs. These can include a requested software feature or the code itself, as well as any documentation.
  • Outcomes: What changes ensue in response to the inputs and outputs? Will employees do their jobs differently because key business processes have been re-engineered? Will customers change their behavior?
  • Impacts: What value accrues to the business? Are employees more efficient? Are customers buying more products?

How are businesses currently measuring developer productivity?

The two most common metric frameworks in use for measuring developer productivity are DORA and SPACE.

Using DORA to measure development outcomes

Named for Google’s DevOps Research and Assessment (DORA) team that created them, the DORA standards measure outcomes. The DORA group identified four metrics for DevOps teams, with the goals of both improving developer efficiency and being able to communicate results that will have meaning for business leaders.

The four metrics are divided into two buckets – velocity and stability – because both are integral to DevOps to make sure teams don’t over-emphasize speed over quality.

  • Deployment frequency: How frequently the team successfully releases code changes to production (this measures velocity).
  • Lead time for changes: How long it takes for a commit to get into production (measures velocity).
  • Change failure rate: Percentage of deployments causing a failure in production (this measures stability).
  • Time to restore service: The amount of time it takes an organization to recover from a production failure (measures stability).
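
To make the four definitions above concrete, here is a minimal sketch of how they might be computed from a list of deployment records. The `Deployment` record shape, its field names, and the choice of medians are assumptions made for illustration; they are not part of the DORA definitions, and time to restore is approximated here as the gap between the failing deployment and recovery.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a production failure?
    restored_at: Optional[datetime] = None  # when service recovered, if it failed


def dora_metrics(deployments: list[Deployment], window_days: int) -> dict:
    """Summarize the four DORA metrics over a reporting window."""
    if not deployments or window_days <= 0:
        raise ValueError("need at least one deployment and a positive window")
    failures = [d for d in deployments if d.failed]
    # Assumption: restore time runs from the failing deployment to recovery.
    restore_times = [d.restored_at - d.deployed_at for d in failures if d.restored_at]
    return {
        # Velocity
        "deployments_per_day": len(deployments) / window_days,
        "median_lead_time": median(d.deployed_at - d.committed_at for d in deployments),
        # Stability
        "change_failure_rate": len(failures) / len(deployments),
        "median_time_to_restore": median(restore_times) if restore_times else None,
    }
```

Treat the result as a team-level summary to discuss, not a score for any individual.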

DORA metrics are used to classify teams as elite, high, medium, or low performing, with the objective of using these classifications to drive improvements. According to Google’s internal measurements, elite teams are twice as likely to meet or exceed their organizational performance goals as teams in other performance categories.

SPACE goes for less-quantifiable assessments

A second portfolio of measurements is called the SPACE metrics (the acronym stands for satisfaction and well-being, performance, activity, communication and collaboration, and efficiency and flow). SPACE was co-developed by GitHub and Microsoft to bolster the DORA framework, which was perceived as lacking focus on the admittedly difficult-to-quantify state of developer happiness.

  • Satisfaction and well-being: These are important dimensions of productivity. Surveys that ask developers whether they would recommend their team to others, whether they have the right tools for their jobs, and whether they are at risk of burnout are the best way of gathering this data.
  • Performance: This is another hard-to-quantify aspect of software development. The closest measures are outcomes rather than outputs or even impacts because a developer might deliver a high volume of code, but it might not be of sufficient quality. Likewise, even high-quality code might not be enough to induce customers to change purchasing behavior. Often, performance evaluations come down to a binary question: does the code do what it was designed to do?
  • Activity: Counting outputs such as design documents, pull requests (PRs), commits, builds, tests, or incident mitigations gives some sense of productivity, but these measures are limited, and it is debatable how much value there is in quantifying developers’ activities across entire DevOps environments, because activity on its own isn’t really good or bad. Does a high volume of PRs automatically mean high productivity? Not if your team is opening a lot of PRs to revert changes or fix issues that made their way into production. Bottom line: activity numbers should never be used by themselves, out of context. Still, assessing different aspects of activity can add some data to the big productivity picture.
  • Communication and collaboration: Software cannot be developed without a great deal of effective communication and collaboration, both within and between teams. As another difficult-to-quantify attribute, communication and collaboration can be measured by proxies such as how quickly code is integrated, assessments of work review quality by team members, and onboarding time for new team members.
  • Efficiency and flow: Flow is an important concept for many developers, who describe it as being able to work without interruptions. Much literature has been devoted to suggesting ways that developers can optimize their flow. You can attempt to measure this by counting the number of handoffs or interruptions in a process, by surveying developers about their ability to stay in flow, and by similar metrics.

Some other commonly used developer productivity metrics

Either as part of DORA or SPACE, or as standalone metrics, the following are also used by organizations to measure developer productivity:

  • Cycle time: This is the time from first commit to production release – or from beginning to finishing work on an assignment. In general, shorter cycle times are considered better, but they shouldn’t be accelerated at the expense of quality.
  • PR size: This measures how much code changes in a single pull request. A pull request is opened when a developer is ready to begin the process of merging new code changes with the project repository. This allows developers to create new features or fix bugs without affecting users or worrying about breaking the overall service or application.
  • Investment profile: This enables teams to visualize where they are spending their resources and time. This helps management do a better job of distributing work based on business priorities.
  • Planning accuracy: This is the ratio of story points completed in an iteration to the total story points planned for it. This is a good metric for honing sprint planning.
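
Two of the metrics above reduce to simple arithmetic. The sketch below shows one way to compute them; the function names and parameters are hypothetical, chosen for illustration only.

```python
from datetime import datetime, timedelta


def cycle_time(first_commit: datetime, released: datetime) -> timedelta:
    """Elapsed time from first commit to production release."""
    if released < first_commit:
        raise ValueError("release cannot precede the first commit")
    return released - first_commit


def planning_accuracy(completed_points: float, planned_points: float) -> float:
    """Story points completed divided by story points planned for the iteration."""
    if planned_points <= 0:
        raise ValueError("planned_points must be positive")
    return completed_points / planned_points
```

For example, a team that planned 40 story points and finished 34 has a planning accuracy of `planning_accuracy(34, 40)`, or 0.85. A change first committed at 9:00 on March 1 and released at 15:00 on March 4 has a cycle time of 3 days and 6 hours.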

Behold the backlash

Then there are the new McKinsey metrics that have the developer community up in arms. McKinsey is not the first – and won’t be the last – to believe that DORA and SPACE don’t go far enough. McKinsey says its methodology complements DORA and SPACE with new opportunity-focused metrics, pointing out that they are necessary because software development is changing so rapidly due to generative AI tools such as ChatGPT. McKinsey’s own research found that such tools have the potential to enable developers to complete tasks up to two times faster.

Some of the new metrics McKinsey proposes include a “developer velocity index benchmark,” “contribution analysis,” and “talent capability scores” – each of which would increase scrutiny of both team and individual productivity.

Developer and father of extreme programming Kent Beck wrote on LinkedIn, “The report is so absurd and naive that it makes no sense to critique it in detail.” In a later post, he added, “Why would I take the risk of calling out a big, influential organization. It’s because what they published damages people I care about. I’m here to help geeks feel safe in the world. This kind of surveillance makes geeks feel less safe.”

Gergely Orosz, who writes “The Pragmatic Engineer” blog, co-wrote a two-part rebuttal to the McKinsey article with Beck. One of the things the authors concluded was that it was certainly a worthy goal to try to make development teams more accountable to the business, in the same way that sales and human resources (HR) teams are.

But to help developers become more productive – without causing harm – the goal has to be to develop and sustain high-performing teams, which Orosz and Beck defined as “teams where developers satisfy their customers, feel good about coming to work, and don’t feel like they’re constantly measured on senseless metrics.”

The problem with using the wrong metrics, or misapplying the right ones, say Orosz, Beck, and others who weighed in, is that the very fact of measuring invites developers to change how they work so that they win against the system. Start judging your developers on how many lines of code they produce and you’ll get plenty of code – but quality may well suffer.

Tech journalist Bill Doerrfeld, blogging at DevOps.com, agreed, pointing to what British economist Charles Goodhart wrote, which Doerrfeld summarized as “when a measure becomes a target, it ceases to be a good measure.” This can cause overall developer culture as well as quality to deteriorate.

So leaders must be very clear on what the real targets of developer productivity are for their businesses. Do you want higher-quality code that makes an impact? Then do your best to measure those things. As a case in point, Google analyzed developer inputs and outputs on a broad range of parameters and found that improved code quality correlated with increased developer productivity.

What to measure: Team or individual developer productivity?

Generally speaking, most savvy CTOs don’t try to measure the productivity of individuals. There are many reasons for this, chief among them that most industry observers – not to mention developers themselves – believe that a successful DevOps organization is not just a group of individuals who work independently, but a cohesive team that together produces valuable products and services for the business.

Developers are constantly collaborating and interacting, and much of this cannot be measured because of the interdependencies and nuances. For example, some team members might not produce a lot of code on their own, but they are invaluable to their colleagues because of their help, advice, and expertise.

Team productivity, on the other hand, is much more visible, for all the reasons discussed in this blog. Managers or HR professionals who want to assess individual performance for annual reviews or other employment milestones should invest in developing organizational best practices for people management, such as having regular one-on-one meetings between managers and team members; soliciting anonymous feedback from all team members; and encouraging individuals to exercise personal accountability.

Much of this is based upon the culture of the DevOps team, rather than any systemic approach to track productivity.

Common mistakes to avoid when measuring developer productivity

As previously mentioned, calculating numbers alone isn’t enough. Organizations need qualitative measures in addition to quantitative ones.

Problems with input measurements

The issue with depending on inputs, or efforts, such as hours worked, is that it encourages the wrong behaviors. If the company culture is to value – and reward – hours spent in front of a screen, developers will almost certainly put in the hours, but of what quality will the work be when it is delivered? In more toxic environments, it can even turn into a competition over who comes in earliest and stays latest. In such cases, developers are likely to produce empty, or even negative, work, and accomplish less than they otherwise would have done.

Problems with output measurements

Some of the worst metrics fall into this category, such as counting lines of code or commits. A line of code that doesn’t achieve the purpose it’s meant for is worth nothing. And gaming a measurement like that is easy, as developers can churn out lines of indifferent code quite quickly. Any output metrics need to be in context, not treated as standalone truths.

Problems with outcome and impact measurements

The challenge with these two types of metrics – outcome and impact – is figuring out how responsible the DevOps team is for the outcome or impact in question. As Orosz and Beck point out, if you try to measure increased profits for the business, it’s nearly impossible to attribute such a rise to the developers only. However, of all possible metrics, these are possibly the closest to reflecting business goals – which is ultimately the point of measuring developer productivity after all.

Ways to improve developer productivity

Developer productivity is influenced by a broad range of factors. Here are some of the most important levers you can manipulate to improve it.

  • Nurture the right culture: This is probably the single most important factor to improve productivity. The environment should value and promote collaboration rather than competition. You want to promote the sharing of expertise and knowledge, and make processes and operations as transparent as possible to avoid misunderstandings. Work-life balance, reducing stress, and physical and mental health should all be priorities. Management should be supportive and provide sufficient resources with realistic timelines, appropriate task assignments, and constructive feedback. Finally, the work environment should be as free from distractions as possible.
  • Provide the right tools: High-quality and appropriate development tools, such as integrated development environments (IDEs), debuggers, languages, and frameworks, can have a huge impact on productivity. So can observability solutions when it comes to reducing troubleshooting time and getting to root causes fast. When possible, invest in platforms rather than disparate point solutions that require separate (and often steep) learning curves.
  • Institute proven processes: Development methodologies such as Agile, Scrum, or Kanban, if implemented effectively, can streamline the development process, and significantly improve productivity.
  • Deploy automation where appropriate: Automated testing, continuous integration, and continuous deployment can reduce time spent on repetitive tasks.
  • Offer ample training and learning opportunities: Providing developers with the resources to learn and grow can also contribute to long-term increases in productivity.
  • Emphasize code quality: By stressing quality over quantity or (even) velocity, code will be easier to maintain, decreasing technical debt and making future changes easier, thus increasing long-term productivity.

Four ways Chronosphere observability helps improve developer productivity

For companies operating in cloud native environments, investing in the Chronosphere cloud native observability platform is one proven way to boost the productivity of DevOps professionals.

A Market Insight Report from 451 Research, part of S&P Global Market Intelligence, titled “Chronosphere aims to improve team productivity and manage usage on its cloud native observability platform” details how businesses can benefit from the latest features in observability tooling to improve developer productivity.

Here are four ways that Chronosphere is moving productivity forward:

  1. Gives your developers the right cloud native observability tools for the way they actually work: By deploying tools that automatically present each team with just the data that is relevant to them, Chronosphere cuts out all the other noise. For modern environments organized with small, interdependent engineering teams, Chronosphere supports workflows aligned with how your distributed teams operate.
  2. Enables developers to analyze, refine, and operate observability data at scale: Chronosphere empowers DevOps teams to analyze their data to understand what is useful and what is waste. It enables them to shape their data to improve its usefulness and eliminate what they don’t need. All of this can be achieved without touching source code and redeploying, so it doesn’t slow down developers.
  3. Optimizes your team for both speed and performance: Chronosphere rapidly loads real-time dashboards and responds swiftly to queries so that less time is spent worrying about metrics and more is spent working on value-added activities. The goal is fast remediation – and always honing these steps to optimize performance.
  4. Ensures tooling is usable by all levels of engineers, not just power users: Too many observability tools are hyper-focused on power users and leave novice and casual users behind. The result? When an incident occurs, the on-call engineer can rarely solve the problem alone and must escalate it to more experienced – and much more expensive – senior DevOps professionals who should be working on more important problems rather than firefighting.

In summary: Yes, measure developer productivity – but cautiously

Measuring efforts/inputs can be useful for figuring out how your team members work, but creating targets around them is almost guaranteed to backfire and hurt both productivity and team satisfaction. Instead, make outcomes/impacts the center of gravity for targets, and use effort/input measurements only as context for how the team is achieving those targets, never as targets themselves.

In other words, the number of PRs or lines of code per PR is irrelevant on its own, but if you notice that a shift to fewer, bigger PRs corresponds to an increased rate of changes causing incidents, that might be a good thing for the team to reflect on.
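
As a rough sketch of using PR size as context rather than as a target, the snippet below compares how often small and large PRs are followed by incidents. The `MergedPR` record, the `caused_incident` flag, and the 400-line threshold are all assumptions made for illustration; how you link a PR to an incident and where you draw the size line will depend on your own tooling and codebase.

```python
from dataclasses import dataclass


@dataclass
class MergedPR:
    """A merged pull request (hypothetical record shape)."""
    lines_changed: int
    caused_incident: bool  # assumption: incidents are already linked to PRs


def incident_rate_by_size(prs: list[MergedPR], large_threshold: int = 400) -> dict:
    """Share of PRs that caused an incident, split into small and large PRs."""
    counts = {"small": [0, 0], "large": [0, 0]}  # [incidents, total] per bucket
    for pr in prs:
        bucket = "large" if pr.lines_changed >= large_threshold else "small"
        counts[bucket][0] += int(pr.caused_incident)
        counts[bucket][1] += 1
    # None marks a bucket with no PRs, so it is not mistaken for a 0% rate.
    return {
        bucket: (incidents / total if total else None)
        for bucket, (incidents, total) in counts.items()
    }
```

A noticeably higher rate for large PRs is a prompt for a team conversation, not proof of cause, and with only a handful of PRs per bucket the difference may be noise.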

It’s worth noting that the ultimate goal is not just to “be more productive” in a vacuum, but to produce valuable, high-quality software in a sustainable way. A best-of-breed observability solution can help do just that.
