If you’re a software leader, you probably have goals around continuous improvement. But teams often operate in a vacuum. Two common questions we hear are “How are others doing?” and “What does good look like?”
Reporting on your organization’s software delivery performance without industry context makes it hard to convert anecdotes into meaningful yardsticks. As a software leader, you need to help the rest of your business understand what good looks like and what to expect from your organization, but the referenceable data hasn’t been available.
Today, we launched the first-ever real-time Engineering Performance Industry Benchmarks, accessible to everyone for free as part of our commitment to improving how organizations deliver software. These five benchmarks, available as a live dashboard, answer some of the most commonly asked questions about software delivery, sourced from Allstacks’ anonymized, global data set. Software leaders and teams now have a benchmark to contextualize their own performance against and to set meaningful goals and expectations for their teams.
To show the breadth and depth of what’s available, here are the metrics included in the benchmarks, each framed as a question software organizations commonly ask.
The 5 Metrics in the Engineering Performance Industry Benchmarks
- How many days per week do teams write code?
- What percentage of issues are planned vs. unplanned?
- What are the cycle times by issue type?
- What types of code do teams produce?
- What is the average commit size for companies?
Read on for more details on each...
How many days per week do teams write code?
Every software team operates differently, but ultimately, we want to understand how much time our engineers have to develop software each week.
The Coding Days metric helps teams understand whether engineers have sufficient focus time, or whether they have time to code at all. It can also reveal whether they are stuck in other activities, like meetings, that crowd out productive work. Leaders can use the industry trend for Coding Days as a baseline of week-to-week productivity and capacity to compare their organizations against.
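One plausible way to compute a coding-days metric is to count, per engineer and per week, the distinct days with at least one commit. The sketch below assumes commits arrive as `(author, timestamp)` pairs; this shape and the grouping by ISO week are illustrative conventions, not Allstacks' actual definition.

```python
from collections import defaultdict

def coding_days_per_week(commits):
    """Count distinct days with at least one commit, per (author, ISO year, ISO week).

    `commits` is an iterable of (author, datetime) pairs -- a hypothetical
    shape chosen for illustration.
    """
    days_by_week = defaultdict(set)
    for author, ts in commits:
        iso_year, iso_week, _ = ts.isocalendar()
        days_by_week[(author, iso_year, iso_week)].add(ts.date())
    # Collapse each set of distinct dates into a count of coding days.
    return {key: len(days) for key, days in days_by_week.items()}
```

Multiple commits on the same day count once, so the result reflects days with any coding activity rather than raw commit volume.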
What percentage of issues are planned vs. unplanned?
The Planned vs. Unplanned metric shows the distribution of work between planned work and interrupting work, a signal of a team's capacity and of potential risk to strategic initiatives. It shows how much of the delivered work we could plan versus work brought to the team mid-stream, like escaped defects, one-off feature requests, and other, often uncoordinated work.
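The arithmetic behind this metric is simple: the share of completed issues that were planned. The sketch below assumes each issue carries a boolean `planned` flag (e.g., whether it was in scope at sprint start); the field name and definition are illustrative assumptions, not Allstacks' actual data model.

```python
def planned_vs_unplanned(issues):
    """Return the percentage of issues that were planned.

    Each issue is a dict with a boolean 'planned' flag -- a hypothetical
    field standing in for however a team marks sprint-start scope.
    """
    if not issues:
        return 0.0
    planned = sum(1 for issue in issues if issue["planned"])
    return 100.0 * planned / len(issues)
```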
What are the cycle times by issue type?
Cycle Time by Issue Type illustrates how we approach and complete different types of work. Not every type of work is created the same, and often some types of work necessitate faster cycle times than others. Cycle Time by Issue Type helps by breaking down the types of issues being worked on and how long the team takes to resolve each respective type.
With this, you can understand your organization's capacity to address different types of issues while highlighting areas where you might be spending too much or too little time. You can also use it to manage stakeholder expectations on delivery timelines and effort, given the type of work being addressed.
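A cycle-time breakdown like this can be sketched as a median duration per issue type. The example below treats cycle time as resolved minus started; teams define these endpoints differently (created vs. in-progress, etc.), so the field names and convention here are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import median

def cycle_time_by_type(issues):
    """Median cycle time in days, grouped by issue type.

    Each issue is a dict with 'type', 'started', and 'resolved' datetimes --
    hypothetical fields; cycle time here is resolved - started.
    """
    durations = defaultdict(list)
    for issue in issues:
        delta = issue["resolved"] - issue["started"]
        durations[issue["type"]].append(delta.total_seconds() / 86400)
    # Median resists skew from the occasional long-running outlier.
    return {t: median(days) for t, days in durations.items()}
```

Using the median rather than the mean keeps one stuck ticket from distorting the picture for a whole issue type.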
What types of code do teams produce?
Different tasks produce different behaviors in the codebase. When net-new features are being developed, we expect to grow and add to the codebase, categorized in this metric as "New Work." Sometimes the work to develop new features is challenging, resulting in code being rewritten within a short period, or "churn." While some churn is an expected artifact of the iterative process of solving complex problems, too much churn can be a sign of developers spinning their wheels and can lead to downstream maintainability challenges.
Maintaining a healthy ratio of New Work to Churn is a helpful indicator of future maintainability and hygiene. An increase in "legacy refactoring" (rebuilding older parts of the codebase) can be a leading indicator of escaped defects.
What is the average commit size for companies?
Small, frequent commits help teams iterate quickly on features, review code effectively, and drive overall maintainability in the codebase. When large, infrequent commits surface, it can be hard for teams to adequately review code, leading to downstream defects and maintainability nightmares. It also becomes harder to help the rest of the team understand what is being developed; knowledge transfer can become stifled, leading to a low bus factor.
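Commit size is typically measured as lines changed per commit. A minimal sketch, assuming each commit is summarized as an `(additions, deletions)` pair (an illustrative shape, not Allstacks' actual schema):

```python
from statistics import mean

def average_commit_size(commits):
    """Average lines changed (additions + deletions) per commit.

    `commits` is an iterable of (additions, deletions) tuples -- a
    hypothetical shape for illustration.
    """
    sizes = [additions + deletions for additions, deletions in commits]
    if not sizes:
        return 0.0
    return mean(sizes)
```

Tracking this average over time can surface a drift toward large, hard-to-review commits before it shows up as escaped defects.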
Trends from the last twelve months
Here are some trends we see in the data, which is particularly interesting as the state of work has shifted over the last twelve months:
- Teams are coding 5.8% fewer days compared to the prior 12 months.
- Teams are planning 12% more of their work compared to the prior 12 months.
- Bugs are taking 19.4% longer to resolve compared to the prior 12 months.
You can see how changes in behavior impact delivery. These insights ultimately lead to a better understanding of where risks and opportunities for improvement are.
You can see other work pattern trends in our "Covid-19 & the Impact to Engineering Productivity" report.
Engineering Performance Benchmark FAQs
(Ok, no one asked, but we think you might)
Why did we choose these particular metrics?
We’ve spoken to thousands of engineering and product leaders over the last several years. While many organizations have unique workflows, each needs to set targets based on how their organization is doing relative to the industry at large. The metrics represented in the Engineering Performance Benchmark help teams understand their behaviors while accounting for different workflows and unique behaviors. The data is independent of an organization or team’s size, workflow, and development environment, making these metrics universally applicable for self-assessment.
What is the source data for the benchmarks?
The data in the benchmarks is an aggregated, anonymized global data set from our user base and is growing every day as more organizations come on board. Today the Benchmarks are built from 4 core types of data and their respective metadata:
- 10,000,000+ Commits
- 1,000,000+ PRs
- 3,000,000+ Issues
- 250,000+ Contributors
You do not need to be an Allstacks user to access the benchmarks, but you can easily contribute to the data set. By starting a free trial of Allstacks, you can help improve our collective Benchmarks while evaluating the Allstacks Value Stream Intelligence platform for your organization. Connecting your tools is quick, and our SOC 2 Type II platform keeps your data safe and anonymous. You can read up on our security and data privacy standards here.
Without the collective industry, this data and the insights in the benchmarks wouldn’t be possible. We’re honored to present the first set of live industry benchmarks enabling our peers to invest in continuous improvement.
How frequently is it updated?
Daily. The best part of hosting the benchmarks on our platform, rather than publishing a static analysis, is that the metrics refresh every day. The environment around building software is complex and ever-changing, which requires up-to-date metrics: the benchmarks react to the world at large, so you can see how the industry is performing even as major external factors manifest.
See how you stack up.
You can access the Benchmarks right now. We'd also love to show you a personalized demo and get you started with a free trial so you can see these metrics and trends with your own data. You can schedule a demo here.