A software engineering metrics glossary for business and technical leaders

A practical software engineering glossary for the AI era: pull requests, PR size, merge rate, code churn, incident rate, and the DORA metrics engineering teams use to measure AI's impact on productivity and quality.


A software engineering glossary: key metrics explained

This software engineering glossary covers the terms that come up most often when measuring how engineering teams work and how well they deliver. For anyone reading engineering research, working alongside technical teams, or evaluating AI's impact on software development, here is what they actually mean.

The basics: how code gets written and shipped

Repository

A repository ("repo") is the central store where a software project's code lives. It contains every file that makes up the software, the full history of every change ever made to those files, and the branches where new work is developed before it is merged in. Most engineering teams work across multiple repositories, each corresponding to a different service, application, or component of their system.

Pull request

What is a pull request? When a developer completes a piece of work (new code, edited code, or deleted code) they do not add it directly to the shared codebase. They submit it as a pull request ("PR"): a package of changes, typically scoped to one repository, that is visible to the team and open for review before it is accepted. The name comes from asking the codebase to "pull in" the new code. A pull request is the fundamental unit of code delivery in modern software development.


PR size 

PR size (sometimes called diff size) is the number of lines of code added or removed in a single pull request. Small PRs are considered good practice: they are easier to understand, faster to review, and simpler to roll back if something goes wrong. Large PRs touch more of the codebase at once, increase the risk of unintended side effects, and take significantly longer to review.

Code review 

Before a pull request is merged into the shared codebase, one or more colleagues read through it, looking for bugs, security issues, and anything that does not meet the team's standards. Code review is an important quality gate before code ships, though it is not the only one: automated tests also run during the build and deployment process to catch issues that human reviewers might miss. When review works well, it catches problems early, while they are still cheap to fix. Review comments per PR and the length of those comments are useful signals of how much work reviewers are being asked to do and, indirectly, of how much rework the code needs.

PR merge rate 

PR merge rate represents how frequently pull requests are being accepted into the codebase, typically measured per developer over a given period. Because merging a PR represents a completed, reviewed unit of work entering the codebase, merge rate is one of the most common productivity metrics used to measure engineering output. A rising merge rate generally indicates higher output.

Code churn

Code churn is the rate at which code is modified, rewritten, or deleted shortly after it was written, typically measured within a three-week window of the original commit. Some amount of code churn is normal and expected: requirements change, engineers iterate, and early-stage code evolves. High churn becomes a concern when it is persistent, concentrated in specific areas of the codebase, or occurs at a scale that consumes significant engineering capacity without producing durable output. Code churn focuses on rework velocity close to the point of authorship.
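As an illustration, churn can be computed by comparing when each changed line was originally authored to when it was rewritten. This is a minimal sketch, not a standard definition: the 21-day window and the `(authored_on, modified_on)` pair format are assumptions for the example.

```python
from datetime import date, timedelta

# Assumed churn window: changes landing within three weeks of authorship.
CHURN_WINDOW = timedelta(days=21)

def churn_rate(changes):
    """changes: list of (authored_on, modified_on) date pairs, one per
    changed line. Returns the fraction of changes counted as churn."""
    if not changes:
        return 0.0
    churned = sum(
        1 for authored_on, modified_on in changes
        if modified_on - authored_on <= CHURN_WINDOW
    )
    return churned / len(changes)

changes = [
    (date(2026, 1, 1), date(2026, 1, 10)),  # rewritten after 9 days: churn
    (date(2026, 1, 1), date(2026, 3, 15)),  # rewritten after ~10 weeks: not churn
]
print(churn_rate(changes))  # 0.5
```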

Code deletion ratio

Code deletion ratio is the ratio of lines of code deleted to lines added for merged code within a given time period. Where code churn captures rework close to the point of authorship, the code deletion ratio operates over a longer window. Tracked over time and broken down by repository or application, the code deletion ratio can reveal whether high-deletion periods reflect productive architectural evolution, such as legacy systems being replaced, or whether they signal a pattern of rework that points to quality problems at the authoring stage.
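A rough sketch of the per-repository breakdown described above; the repository names and line counts are hypothetical, and a high ratio is only a prompt for investigation, not a verdict.

```python
# Hypothetical per-repository line counts for merged code in one quarter.
repos = {
    "billing-service": {"added": 1200, "deleted": 300},
    "legacy-monolith": {"added": 200, "deleted": 1000},
}

for name, r in repos.items():
    ratio = r["deleted"] / r["added"]
    print(name, ratio)
# billing-service has a ratio of 0.25; legacy-monolith's ratio of 5.0 could
# reflect a system being retired rather than a quality problem.
```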

Acceptance rate (AI-generated code) 

Acceptance rate is a measure of how much AI-generated code is making its way into the codebase, though what it captures depends on the tool. For autocomplete-style assistants, acceptance rate reflects how often a developer accepts a suggestion. For agent-based tools like Cursor or Claude Code, which apply changes directly, the metric works differently and is not always comparable. Accepted code is also not final: developers frequently accept AI output and then edit or delete significant portions of it. Despite these nuances, acceptance rate is directionally useful. A rising acceptance rate signals that AI-generated contributions are playing a larger role in what gets written, which is relevant context for interpreting quality and output metrics downstream.

Agentic PRs 

Agentic PRs are pull requests opened by an AI agent that was assigned a task and autonomously wrote the code to complete it, with no human writing the code directly. They are distinct from AI-assisted development, where a human writes with AI support and retains direct authorship of every change.

How work flows through a team

Engineering work is typically tracked as a hierarchy of units, from large strategic efforts down to individual defects.

Epic 

An epic is a large, multi-sprint body of work tied to a meaningful product initiative, such as launching a new feature, rebuilding a core system, or delivering a significant capability to users. Epics sit above individual tasks in most project management systems and represent the kind of work that has visible business value attached to it. Unlike a single task or bug fix, completing an epic typically means something material has shipped. Epic completion rate is one of the metrics most directly tied to organizational outcomes.

Task 

A task is a discrete unit of work within an epic or project, scoped to something a developer can complete in a defined period. Tasks vary widely in size and type, from writing documentation to implementing a specific function. They are the day-to-day unit of planning and tracking for most engineering teams.

Bug 

A bug is a defect in software that causes it to fail or produce incorrect results. Bugs are typically logged as tasks in a team's project management system (for example, as a Jira ticket) and prioritized based on severity. A bug that reaches production, one that users encounter in a live system, is more costly to fix than one caught during development or review.

Incident 

An incident is a severe failure in a production system: an outage, a security event, or a malfunction that affects real users or operations. Incidents are distinct from bugs in that they represent active failures in live systems, not just defects waiting to be fixed. Incidents per PR is a particularly useful normalized measure because it controls for changes in deployment volume, revealing whether the probability of failure per code change is rising or falling.
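A small sketch of why the normalization matters, using hypothetical quarterly figures: raw incident counts can rise while the failure rate per code change actually falls.

```python
# Hypothetical figures for two quarters.
periods = {
    "Q1": {"incidents": 4, "merged_prs": 800},
    "Q2": {"incidents": 6, "merged_prs": 1500},  # more incidents, but...
}

for name, p in periods.items():
    print(name, p["incidents"] / p["merged_prs"])
# Q1 is 0.005 incidents per PR, Q2 is 0.004: incidents rose in absolute
# terms, yet the probability of failure per change went down.
```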

Deployment

Deployment is the process of releasing code from development into a live environment where real users can access it. A deployment is the moment software moves from being written and tested to being in production. Deployments can range from a major release that introduces new features to a small patch fixing a single bug.

Throughput 

Throughput is a measure of how much work is being completed over a given period, applied across any of the units above. Epic throughput measures how many large initiatives are being completed. Task throughput measures day-to-day output. PR throughput measures code delivery. Throughput figures are most meaningful when tracked over time within a consistent organizational context and broken down by work type, since a rise in task throughput does not necessarily mean more meaningful work is being shipped.

Context switching 

Context switching refers to moving between different pieces of work rather than sustaining focus on one. In software development, context switching carries a documented cost: rebuilding mental context after an interruption takes significant time before a developer can work effectively again. Metrics like daily PR contexts per developer and daily task contexts per developer measure how many parallel threads a developer is managing at once. Rising context switching is generally considered a negative signal, though the relationship is being re-examined as AI tools change how developers work.

Work in progress (WIP) 

Work in progress refers to work items that have been started but not yet completed. High WIP often signals that a team is starting more work than it can finish, leading to bottlenecks, longer cycle times, and tasks that stall mid-flow. In-progress tasks with no activity for seven or more days are a useful indicator of work that has been claimed but is not moving.

DORA metrics: the industry standard for delivery performance

The DORA metrics were developed by the DevOps Research and Assessment team, now part of Google, through years of research into what separates high-performing engineering organizations from the rest. They have become the closest thing the industry has to a standard framework for measuring software delivery. The framework currently comprises five metrics.

Deployment frequency 

Deployment frequency is a measure of how often a team deploys code to production. Higher deployment frequency is generally associated with better engineering performance: teams that deploy frequently tend to work in smaller batches, catch problems earlier, and recover from failures faster. A drop in deployment frequency, even when code volume is rising, often indicates a bottleneck somewhere in the delivery pipeline.

Lead time for changes

Lead time is the time from when code is committed by a developer to when it is running in production. Short lead times indicate an efficient, low-friction delivery process. Long lead times mean code is sitting in queues, waiting on review, stuck in testing, or delayed by manual processes. Lead time is one of the most direct measures of how quickly an engineering organization can respond to a need.
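In practice, lead time is a distribution over many changes, usually summarized by a median or percentile. A minimal sketch, assuming each change is recorded as a `(committed_at, deployed_at)` timestamp pair:

```python
from datetime import datetime
from statistics import median

def lead_times_hours(pairs):
    """pairs: (committed_at, deployed_at) datetimes; returns hours per change."""
    return [(deployed - committed).total_seconds() / 3600
            for committed, deployed in pairs]

pairs = [
    (datetime(2026, 1, 1, 9), datetime(2026, 1, 1, 17)),  # 8 hours
    (datetime(2026, 1, 2, 9), datetime(2026, 1, 4, 9)),   # 48 hours
    (datetime(2026, 1, 5, 9), datetime(2026, 1, 5, 21)),  # 12 hours
]
print(median(lead_times_hours(pairs)))  # 12.0
```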

Failed deployment recovery time

Formerly known as mean time to recover (MTTR), failed deployment recovery time measures how long it takes to restore service after a deployment causes a failure, specifically from the moment an incident is detected to when customer impact ends. Fast recovery is a sign of good observability, clear ownership, and practiced incident response. Slow recovery means failures linger and affect users for longer. One important caveat: a single average recovery time can be misleading. A team resolving most incidents in five minutes but occasionally taking two hours looks very different from one that consistently takes thirty minutes. The distribution matters as much as the number. DORA redefined this metric in 2023 to focus specifically on failures caused by software changes, distinguishing them from failures caused by external factors like infrastructure outages.
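The caveat about averages can be made concrete with two hypothetical teams whose mean recovery times invert their typical experience:

```python
from statistics import median

# Hypothetical recovery times in minutes for two teams.
steady = [30] * 20        # every incident takes about 30 minutes
spiky = [5] * 19 + [120]  # most take 5 minutes, one takes 2 hours

for name, times in (("steady", steady), ("spiky", spiky)):
    mean = sum(times) / len(times)
    print(name, "mean:", mean, "median:", median(times), "worst:", max(times))
# The spiky team's mean (10.75 min) looks far better than the steady team's
# (30 min), yet its worst incident lasted four times longer than any of
# the steady team's: the distribution matters as much as the average.
```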

Change failure rate

Change failure rate (CFR) is the percentage of deployments that cause a failure in production requiring immediate attention, whether a rollback, a hotfix, or an emergency patch. A lower change failure rate indicates a more stable and reliable delivery process. Elite engineering teams maintain low change failure rates while also deploying frequently, demonstrating that speed and stability are not inherently in conflict.
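As a sketch, CFR is a simple proportion over deployment records. The `caused_failure` flag is an assumption for the example; in practice, teams derive it from rollbacks, hotfixes, and incident links.

```python
# Hypothetical deployment log for one month.
deployments = [
    {"id": 1, "caused_failure": False},
    {"id": 2, "caused_failure": True},   # required a rollback or hotfix
    {"id": 3, "caused_failure": False},
    {"id": 4, "caused_failure": False},
]

failed = sum(1 for d in deployments if d["caused_failure"])
cfr = failed / len(deployments)
print(f"{cfr:.0%}")  # 25%
```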

Rework rate

Added as the fifth official DORA metric in 2024, rework rate measures the percentage of deployments that were unplanned and performed specifically to fix a user-facing bug. Where change failure rate captures whether a deployment caused a problem, rework rate captures how much of a team's deployment capacity is being consumed by fixing problems rather than shipping new value. A high rework rate is a strong signal that quality issues are escaping into production at a significant rate.

These metrics are most useful when tracked together and over time within a consistent organizational context. A single metric in isolation rarely tells the full story; the patterns across metrics, and how they shift as teams adopt new tools and practices, are where the signal lives.

Why these metrics matter more than ever 

AI is changing what engineering teams produce, how fast they produce it, and how much of it holds up in production. The metrics in this glossary are the instruments that make those changes visible. For a look at what they are currently showing across 22,000 developers, read The AI Engineering Impact Report 2026: The Acceleration Whiplash.


Naomi Lurie

Naomi Lurie is Head of Product Marketing at Faros. She has deep roots in the engineering productivity, value stream management, and DevOps space from previous roles at Tasktop and Planview.
