
A 5th DORA Metric? Rework Rate is Here (And You Can Track It Now)

Discover the 5th DORA metric: Rework rate. Learn what it is, why it matters in the AI era, and how to start tracking it today. Get industry benchmarks, see what good looks like, and find practical tips to reduce wasted engineering effort and boost performance.

Thierry Donneau-Golencer
A semicircular gauge chart titled “Rework Rate.” The needle points to 22%, which falls in the red “Low” performance zone. The scale ranges from red (Low), to yellow (Medium), green (High), and dark green (Elite).
8 min read
October 1, 2025

An evolution of the DORA metrics framework

Google Cloud has just published its annual DORA (DevOps Research and Assessment) report, with a strong focus on the impact of AI on software engineering. If you haven't seen it yet, check out our summary of key findings from the DORA Report 2025.

For years, the DORA framework has been synonymous with four key metrics: deployment frequency, lead time for changes, change failure rate, and failed deployment recovery time (MTTR). But the DORA Report 2024 marked a significant evolution of this framework, and the 2025 report completed the picture.

What new metric was announced in the 2024 DORA report?

The metrics expanded to five, adding rework rate to the mix. However, no benchmarks were published at the time. The framework was also reorganized into two new categories:

  • Three throughput metrics: deployment frequency, lead time for changes, and failed deployment recovery time
  • Two instability metrics: change failure rate and rework rate

| Performance Factor | DORA Metric | What It Measures |
| --- | --- | --- |
| Throughput | Lead time for changes | The time it takes for a change to go from committed to version control to deployed in production. |
| Throughput | Deployment frequency | The number of deployments over a given period, or the time between deployments. |
| Throughput | Failed deployment recovery time | The time it takes to recover from a deployment that fails and requires immediate intervention. |
| Instability | Change failure rate | The ratio of deployments that require immediate intervention, likely resulting in a rollback of the changes or a “hotfix” to quickly remediate issues. |
| Instability | Rework rate | The ratio of deployments that are unplanned and performed in response to an issue discovered in production. |

The five DORA metrics

Fast-forward to 2025, and the report now includes benchmarks for all five DORA metrics, including rework rate. DORA benchmarks are updated every year and help teams and organizations compare themselves against their peers and, more importantly, set realistic improvement goals and track progress over time.

This year, the DORA report also moved away from the traditional low/medium/high/elite performance designations to finer-grained, per-metric buckets.

Why was rework rate added as a 5th DORA metric?

The DORA research group had a hypothesis: Change Failure Rate (the ratio of deployments resulting in severe degradation or outage in production) works as a proxy for the amount of rework a team is asked to do. When a delivery fails, teams must fix the change, likely by introducing another deployment. 

To test this theory, they added a new survey question about rework rate: "For the primary application or service you work on, approximately how many deployments in the last six months were not planned but were performed to address a user-facing bug in the application?"
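
Boiled down to a formula, the survey question above amounts to a simple ratio. The sketch below is one way to express it; the field names are illustrative assumptions, not part of any official DORA definition.

```python
from dataclasses import dataclass

@dataclass
class Deployment:
    is_planned: bool                 # was this deployment scheduled work?
    addresses_user_facing_bug: bool  # was it shipped to fix a user-facing defect?

def rework_rate(deployments: list[Deployment]) -> float:
    """Share of deployments that were unplanned fixes for user-facing bugs."""
    if not deployments:
        return 0.0
    rework = sum(
        1 for d in deployments if not d.is_planned and d.addresses_user_facing_bug
    )
    return rework / len(deployments)
```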

By measuring rework rate explicitly and analyzing it alongside change failure rate, the research group built a more reliable picture of software delivery stability. It’s no longer just, “Did we break production?” It’s also, “How often are we compelled to ship unplanned fixes because defects slipped through?” 

Those two signals, deployment instability and the subsequent churn it causes, provide a more holistic view of the impact of delivery issues.

When deployments are smooth, teams are more confident about pushing changes to production, and end users are less likely to experience issues with the application. 

When deployments don’t go well, teams end up wasting precious time fixing issues, affecting team morale and delaying feature work, while end users get frustrated with a degraded experience.

Why rework rate is timely in the age of AI

Rework rate couldn't be more relevant given the rapid adoption of AI coding tools sweeping across engineering organizations.

Throughput goes up: More code, more experiments, more change velocity. But quality gates like reviews, tests, and staging checks don’t automatically scale with that pace. You can feel the tension in the day-to-day:

  • Pull requests get bigger and more frequent, which creates cognitive overload for reviewers and allows subtle regressions to sneak through.
  • Review queues back up, so feedback arrives later in the cycle, and more defects are discovered post‑merge.
  • After deployment, teams spend more time debugging and shipping unplanned fixes.

Faros AI's research quantifies these concerning downstream effects:

  • Code review time increases 91% as PR volume outpaces reviewer capacity
  • Pull request size grows 154%, lengthening review cycles and raising the risk that important details are missed
  • Bug rates climb 9% as quality gates struggle with larger diffs and increased volume

Charts: AI's impact on throughput and workflows; AI's impact on PR size and quality

In Stack Overflow’s 2025 Developer Survey, 84% of respondents indicated using or planning to use AI tools, yet trust in their accuracy has sagged. 

The most common pain point, reported by 66% of survey respondents, is encountering AI solutions that are “almost right.” And 45% say debugging AI‑generated code is more time‑consuming. In other words, the savings you expected up front can be eaten later in rework by the time spent inspecting, fixing, and re‑deploying.

In this environment, tracking rework rate carefully becomes essential. The benchmarks were first published this year, and it will be fascinating to see how they evolve in 2026 as AI adoption continues to accelerate.

Good news: You can start tracking rework rate today

If you’re eager to get insight into your teams’ performance, you can start tracking rework rate today in Faros AI—and nowhere else! Our DORA metrics dashboards measure rework rate at a given point in time, trend it over weeks, months, and years, and break down the results by organizational unit and by application or service (see tips below) to pinpoint where instability is concentrated.

A sample dashboard tracking the two instability metrics, CFR and rework rate, on Faros AI

This fifth DORA metric is now included as part of our Engineering Efficiency Solution, giving you the complete picture of your software delivery performance in the AI era. Don't wait to understand how AI tools are impacting your team's stability. Contact us to start measuring all five DORA metrics now. 

{{cta}}

Frequently asked questions about rework rate—the 5th DORA metric

How is rework rate measured?

Rework rate measures the percentage of deployments that were unplanned and performed to address user-facing bugs in your application. According to the DORA research group's definition, it's calculated by tracking deployments made specifically to fix defects that users encountered, rather than deployments that deliver new features or planned improvements.

Faros AI automatically identifies and classifies these unplanned deployments by analyzing your deployment data, linking it to incidents and bugs from your incident management and task management systems. This gives you an accurate, data-driven view without relying on manual surveys.
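
Conceptually, the classification works by joining deployment records to the defects they were shipped to fix. The sketch below is a simplified illustration of that idea, not Faros AI's actual pipeline; the data shapes and the commit-based matching rule are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Deployment:
    deploy_id: str
    commit_shas: list[str] = field(default_factory=list)

@dataclass
class BugTicket:
    ticket_id: str
    user_facing: bool
    fix_commit_shas: list[str] = field(default_factory=list)

def classify_rework(deployments: list[Deployment], bugs: list[BugTicket]) -> dict[str, bool]:
    """Mark a deployment as rework if it shipped a fix for a user-facing bug."""
    fix_shas = {
        sha
        for bug in bugs if bug.user_facing
        for sha in bug.fix_commit_shas
    }
    return {d.deploy_id: any(sha in fix_shas for sha in d.commit_shas) for d in deployments}
```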

What should be the unit of analysis (team, app, service) and why?

The optimal unit of analysis depends on your organization's structure, but we recommend starting at the service or application level, then rolling up to teams.

Here's why:

  • Services/applications are where rework actually manifests. A single team might own multiple services with vastly different rework rates, and aggregating too early can mask problem areas.
  • Team-level analysis becomes powerful once you understand service-level patterns. It helps you identify whether rework issues are systemic to how a team operates or isolated to specific technical domains.
  • Organizational rollups are useful for executive dashboards, but drilling down is where you find actionable insights.

In Faros AI, you can analyze rework rate at any of these levels and easily pivot between views to understand where intervention is needed most.
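
As a small illustration of why the starting level matters, here is a pandas sketch with hypothetical data: the problematic service is obvious at the service level but diluted once the numbers are averaged up to the team.

```python
import pandas as pd

# Hypothetical deployments; is_rework flags unplanned bug-fix deployments.
deploys = pd.DataFrame({
    "team":      ["payments", "payments", "payments", "payments", "payments", "discovery"],
    "service":   ["checkout", "checkout", "billing",  "billing",  "billing",  "search"],
    "is_rework": [True,       True,       False,      False,      False,      False],
})

# Service level: checkout stands out at 100% rework.
print(deploys.groupby("service")["is_rework"].mean())

# Team level: payments averages out to 40%, masking the checkout problem.
print(deploys.groupby("team")["is_rework"].mean())
```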

Why measure rework rate separately from change failure rate?

The combination of both metrics gives you a complete picture:

  • CFR tells you: How often do we break production badly?
  • Rework rate tells you: How much unplanned work are we creating for ourselves?

This distinction is especially important in the AI era. As our data shows, AI tools are increasing PR volume and size while bug rates climb 9%. You might maintain a stable CFR through robust safeguards, but if your rework rate is climbing, you're accumulating technical friction that will eventually slow your throughput metrics (deployment frequency and lead time).

Together, these two instability metrics help you distinguish between "we ship fast and rarely break things catastrophically" versus "we ship fast with consistently high quality."
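
A small worked example, with hypothetical numbers, makes the distinction concrete:

```python
total_deploys = 100            # deployments in the period (hypothetical)
failed_deploys = 4             # needed immediate intervention (rollback/hotfix)
unplanned_bugfix_deploys = 18  # shipped only to fix user-facing bugs

change_failure_rate = failed_deploys / total_deploys    # 0.04 -> 4%
rework_rate = unplanned_bugfix_deploys / total_deploys  # 0.18 -> 18%

# A healthy-looking CFR can coexist with a high rework rate: production rarely
# breaks badly, yet nearly one in five deployments is unplanned defect cleanup.
```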

How do AI coding tools specifically impact rework rate?

AI coding tools create a paradox: individual developers write code faster, but the downstream effects can increase rework. Here's the mechanism:

Larger PRs (up 154%) mean reviewers have more cognitive load and less ability to spot subtle bugs. More PRs overall (causing 91% longer review times) mean reviewers are rushed and may approve changes with less scrutiny. The combination leads to more defects reaching production, which our data confirms with a 9% increase in bug rates.

The key is to track rework rate alongside your AI adoption metrics. If you're seeing productivity gains but rework rate is climbing, invest in better automated testing and strengthen your quality gates.
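
One lightweight way to operationalize that advice is to watch the two trends side by side. The sketch below uses hypothetical monthly figures and an arbitrary rule of thumb; it is not a Faros AI feature or an official benchmark.

```python
import pandas as pd

monthly = pd.DataFrame({
    "month":           ["2025-06", "2025-07", "2025-08", "2025-09"],
    "ai_adoption_pct": [40, 55, 70, 82],
    "rework_rate_pct": [9, 11, 14, 17],
})

# If AI adoption keeps climbing and rework rate has risen for two straight months,
# treat it as a signal to strengthen automated tests and review gates.
rework_rising = (monthly["rework_rate_pct"].diff().tail(2) > 0).all()
adoption_rising = monthly["ai_adoption_pct"].is_monotonic_increasing
print("Strengthen quality gates:", bool(rework_rising and adoption_rising))
```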

What's a good benchmark for rework rate?

The DORA Report 2025 published the first official benchmarks for rework rate. While we recommend reviewing the full report for detailed benchmarks, the key insight is that elite performers maintain significantly lower rework rates while sustaining high deployment frequency.

In Faros AI, you can compare your rework rate against these industry benchmarks and track your progress over time. Don’t panic if your current rework rate is not in the top tier! The goal is to acknowledge the problem, set realistic goals for continuous improvement, and understand the trend, especially as you adopt new tools and practices.

Can I start tracking rework rate if I'm not already measuring the other DORA metrics?

Absolutely! While rework rate is most powerful when viewed alongside the other DORA metrics, you can start tracking it independently. In fact, if you're currently using AI coding tools and concerned about quality, rework rate might be the single most important metric to baseline right now.

That said, we strongly encourage adopting all five DORA metrics together. They're designed as a system: throughput metrics show your speed, instability metrics reveal your quality, and the interplay between them tells you whether you're optimizing the right things.

Faros AI makes it easy to implement all five metrics at once, with automated data collection from your existing development tools—no manual surveys required.

{{cta}}

Thierry Donneau-Golencer

Thierry is Head of Product at Faros AI, where he builds solutions to empower teams and drive engineering excellence. His previous roles include AI research (Stanford Research Institute), an AI startup (Tempo AI, acquired by Salesforce), and large-scale business AI (Salesforce Einstein AI).
