DORA Metrics: Measuring Software Delivery Performance

DevOps teams need to continuously improve their performance to succeed. Learn all about DORA metrics here!

By Kate Dagher  •   November 21, 2022  •   7 min read

Any company with a software component needs to act quickly to accommodate ever-changing customer needs while still delivering strong, reliable services. To meet these evolving requirements, engineering leaders must find ways for the DevOps (a combination of software development and operations) team to keep improving how it works. While there are many ways of evaluating a software team’s performance, DORA metrics have become popular in recent years because they produce especially insightful evaluations of a software engineering team’s work. DORA metrics give leaders objective data on the performance of software delivery teams so they can drive product and service improvements. 

This article will cover what DORA metrics are; break them down for you; explain the purpose, benefits, and challenges of using DORA metrics; and show how to calculate each metric. If you’re ready, let’s dive in. 

What are DORA metrics? 

In 2018, Google acquired the DevOps Research and Assessment (DORA) team, a research program whose principal goal is to understand what enables teams to achieve high performance in software delivery. The DORA team has established a set of metrics that can accurately evaluate a team’s performance: deployment frequency, lead time for changes, change failure rate, and mean time to restore service. 

1. Deployment frequency

Deployment frequency (DF) indicates how often a company deploys code for a given application, meaning how often there are successful software releases to production. This metric counts the total number of deployments per day and serves as a proxy for the batch size an organization delivers in each release. Typically, the most successful organizations ship much more frequent and much smaller deployments: elite performers deploy on demand, often several times per day, while high performers tend to deploy between once per day and once per week. 
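
To make the calculation concrete, here is a minimal sketch in Python that computes DF as an average number of deployments per day. It assumes you have already pulled successful-deployment timestamps from your CI/CD tooling; the sample data and the deployment_frequency helper are illustrative, not part of any standard API.

```python
from datetime import datetime

# Hypothetical timestamps of successful production deployments;
# in practice these would come from your CI/CD tool's logs or API.
deployments = [
    datetime(2022, 11, 14, 9, 30),
    datetime(2022, 11, 14, 16, 5),
    datetime(2022, 11, 16, 11, 20),
    datetime(2022, 11, 18, 14, 45),
]

def deployment_frequency(timestamps, period_days):
    """Average number of successful production deployments per day
    over the measurement window."""
    return len(timestamps) / period_days

# Four deployments over a five-day working week = 0.8 deploys/day.
print(f"DF: {deployment_frequency(deployments, period_days=5):.1f} deploys/day")
```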

2. Lead time for changes

The lead time for changes (LT) metric refers to the amount of time between a code change being committed and that change running in production. In other words, LT measures how long committed code takes to reach production, which reflects the velocity of the team’s software delivery. This metric gives engineering leaders a better understanding of the DevOps cycle time and of how well increased requests can be handled. Elite and high-performing teams have a very short lead time for changes; the average time from commit to deployment serves as an indicator of the team’s overall performance. 
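
A similar sketch works for LT, assuming hypothetical (commit time, production deploy time) pairs joined from version control and deployment logs. Taking the median rather than the mean is a common choice, so a single long-lived change doesn’t skew the result.

```python
from datetime import datetime
from statistics import median

# Hypothetical (commit time, production deploy time) pairs; in practice
# these would be joined from version control and deployment records.
changes = [
    (datetime(2022, 11, 14, 9, 0), datetime(2022, 11, 15, 10, 30)),
    (datetime(2022, 11, 14, 13, 0), datetime(2022, 11, 16, 9, 15)),
    (datetime(2022, 11, 15, 11, 0), datetime(2022, 11, 18, 17, 0)),
]

def lead_time_hours(pairs):
    """Median hours from commit to running in production."""
    durations = [(deployed - committed).total_seconds() / 3600
                 for committed, deployed in pairs]
    return median(durations)

print(f"LT: {lead_time_hours(changes):.1f} hours")
```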

3. Change failure rate

The change failure rate (CFR) measures the percentage of changes made to the code that result in incidents or failures in production. This metric therefore reflects the quality and stability of the software. High-performing teams tend to have a CFR of about 0-15%, with elite teams scoring closer to 0%. When tracked over time, this is an effective metric for understanding how much time the team spends fixing errors versus delivering new code. 
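
Because CFR is simply the share of deployments that caused a failure, the calculation itself is trivial; the sketch below assumes you can already count total deployments and label which ones triggered an incident, rollback, or hotfix.

```python
def change_failure_rate(total_deployments, failed_deployments):
    """Share of production deployments that caused an incident,
    rollback, or hotfix, expressed as a percentage."""
    if total_deployments == 0:
        return 0.0
    return 100 * failed_deployments / total_deployments

# Hypothetical month: 40 deployments, 3 of which triggered an incident.
print(f"CFR: {change_failure_rate(40, 3):.0f}%")  # -> CFR: 8%
```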

4. Mean time to restore service

Mean time to restore service (MTTR) refers to the time required to recover from a failure. In software engineering, unplanned outages and incidents happen often, no matter how great the DevOps team is. To keep performing at a high level, teams must be able to quickly recover or restore a system that’s down. Shorter recovery times are typical of elite and high-performing teams, and engineering leaders tend to feel more comfortable giving their team space to experiment or innovate when they know the MTTR is short. 
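
Here is a minimal MTTR sketch, assuming hypothetical incident records with failure-start and service-restored timestamps; in practice these would come from an on-call or monitoring tool.

```python
from datetime import datetime, timedelta

# Hypothetical incident records: (failure started, service restored).
incidents = [
    (datetime(2022, 11, 10, 14, 0), datetime(2022, 11, 10, 14, 50)),
    (datetime(2022, 11, 17, 2, 30), datetime(2022, 11, 17, 6, 0)),
]

def mean_time_to_restore(records):
    """Mean duration between a failure starting and service being restored."""
    total = sum(((restored - failed) for failed, restored in records),
                timedelta())
    return total / len(records)

print(f"MTTR: {mean_time_to_restore(incidents)}")  # 2:10:00 on this sample
```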

The purpose of measuring DORA metrics

The purpose of measuring DORA metrics is to truly understand the software engineering process and identify areas of improvement based on metrics that can be objectively measured and tracked over time. Tracking DORA metrics allows engineering leaders to empower and motivate their teams because the metrics pinpoint which areas of the business require attention, direction, and improvement. As your DORA metrics improve, you gain confidence that you’re making the right decisions for the team and for your customers’ satisfaction. By improving your performance and providing more value to your customers, you become a stronger competitor and create space to deliver new and innovative ideas. 

The benefits of DORA metrics 

DORA metrics are beneficial for a few reasons. First, they support effective decision-making: the DevOps team can identify trends and make decisions informed by objective data, which helps inspire positive changes in how the team delivers to its customers. DORA metrics also help deliver higher value, because they show what performance level is required to reach business objectives and goals while highlighting the value being delivered to customers. Lastly, DORA metrics help motivate the software engineering team: when performance is measured, people who feel responsible for a given metric will likely adjust their behavior to help improve it. This increased motivation improves collaboration and helps the team reach its goals more effectively. 

The challenges of DORA metrics 

While there are many benefits to using DORA metrics, they also pose a few challenges you should be aware of. First, the underlying data is often scattered across different sources in the IT landscape, which makes it difficult to stay organized and can hurt time management. Next, the data is usually only available in raw form, so extraction is time consuming and complex, and the data must be transformed and combined into calculable units before each metric can be computed objectively. Lastly, while the DORA framework is often read as “the quicker, the better,” that mindset can omit measuring the quality of the code or the product. Take sufficient time and care in extracting and calculating each metric, and interpret each one within its appropriate context. 

Calculating the metrics 

By measuring deployment frequency, lead time for changes, change failure rate, and mean time to restore service, organizations can determine whether their teams are elite, high, medium, or low performers in meeting their business objectives. The cluster descriptions below summarize the typical benchmarks, and a small classification sketch follows them. 

Elite

Elite teams are about twice as likely to meet or exceed their organizational performance goals. They have an extremely high DF (multiple deploys per day), a very short LT (less than one day from commit to production), take an hour or less to restore service, and have a non-existent or extremely low CFR (for example, 0%-5%). 

High

A team falls into the high-performing cluster when it deploys the primary application or service it works on between once per day and once per week. A high-performing team also has a fairly quick LT: for the primary application or service the team is working on, it takes between one day and one week to go from code committed to code successfully running in production. A team in this category also takes less than one day to restore service when a service incident or defect occurs. Lastly, a high-performing team has a CFR of roughly 0%-15%. 

Medium

A team is categorized in the medium cluster when its deployment frequency is between once per week and once per month. A medium-performing team likely has an LT of between one week and one month, takes between one day and one week to restore service when an incident or defect occurs, and has a CFR of 16%-30%. 

Low

Low-performing teams have a DF that falls between once per month and once every six months; their LT is likely to be between one month and six months; their MTTR falls between one week and one month; and their CFR ranges from 46%-60%. 
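
As a rough illustration of how these clusters can be applied, the sketch below maps an average deployment frequency onto the four bands described above. The thresholds approximate this article’s bands rather than any official DORA algorithm, and a real scorecard would classify all four metrics the same way.

```python
def classify_deployment_frequency(deploys_per_day):
    """Map an average DF onto the performance clusters described above.
    Thresholds approximate this article's bands; not an official DORA rule."""
    if deploys_per_day >= 1:          # daily or multiple deploys per day
        return "elite"
    if deploys_per_day >= 1 / 7:      # at least once per week
        return "high"
    if deploys_per_day >= 1 / 30:     # at least once per month
        return "medium"
    return "low"                      # once per month down to twice a year

print(classify_deployment_frequency(3.0))    # elite
print(classify_deployment_frequency(0.5))    # high
print(classify_deployment_frequency(0.05))   # medium
print(classify_deployment_frequency(0.01))   # low
```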

Parting advice 

While DORA metrics pose some challenges in their adoption and use, implementing them remains one of the most effective ways to visualize and measure the performance of DevOps and engineering teams. That said, each metric should be evaluated in its given context and interpreted with care to truly understand which underlying practices are effective or ineffective for the team. For DORA metrics to be truly effective, they need to be one part of a broader value stream management effort.
