Is GitHub Copilot Worth It? Real-World Data Reveals the Answer

GitHub execs say Copilot aims to make developers 10x more productive. Being the data-driven folks that we are, we put it to the test.

Thomas Gerber

October 3, 2023

There's been a lot of chatter about GitHub Copilot in our developer circles. Every peer I’ve spoken to recently is in the same position I am: trying to figure out whether GitHub Copilot is worth it, and whether to pay the extra bucks for every developer.

GitHub execs say they aim to make developers 10x more productive. So, being the data-driven folks that we are, we decided to put it to the test.

Introduction

If you've been out of the loop, GitHub Copilot is an AI-powered coding assistant that's been making waves. Within five months of its official launch, 20,000 companies were trying the new tech.

The big question on everyone's mind: Does it live up to the hype? Should it become the default for every single developer?

Well, instead of relying on hearsay, we ran a good old-fashioned experiment at our company. Here's what we found.

Background

To keep things fair and square, we split our team into two random cohorts — one armed with GitHub Copilot (around a third of our developers) and the other without. We made sure the cohorts were not biased in any way (e.g., that one wasn’t stacked exclusively with our most productive developers).
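
If you want to set up a similar split, here is a minimal sketch in Python. It is not the exact procedure we used: the roster, the `split_cohorts` helper, and the fixed seed are all assumptions for illustration. It shuffles the team and puts roughly a third into the Copilot cohort.

```python
import random

def split_cohorts(developers, treated_fraction=1 / 3, seed=42):
    """Randomly assign developers to a Copilot cohort and a control cohort."""
    rng = random.Random(seed)  # fixed seed (an assumption) keeps the split reproducible
    shuffled = list(developers)
    rng.shuffle(shuffled)
    n_treated = round(len(shuffled) * treated_fraction)
    return shuffled[:n_treated], shuffled[n_treated:]

# Hypothetical roster of 12 developers
developers = [f"dev{i:02d}" for i in range(1, 13)]
copilot, control = split_cohorts(developers)
print(len(copilot), len(control))  # 4 8
```

Random assignment alone does not guarantee balance in a small team, so it is still worth comparing the cohorts' historical metrics before the pilot starts.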

Over three months, we closely monitored various performance metrics, focusing on speed, throughput, and quality. Our goal? A clear, unbiased view of GitHub Copilot's impact.

Why these metrics? They're tangible and measurable, and they directly impact our deliverables. Together they also give us a holistic picture: we don’t want to gain speed if there’s a huge price to pay in quality. Finally, they give us a good indication of areas we might need to strengthen in our practices or process if we want to fully go down the GitHub Copilot route.

Results

The data was pretty revealing. The group using GitHub Copilot consistently outperformed the other cohort in terms of speed and throughput over the evaluation period (May-September 2023).

Let’s start with throughput.

Over the pilot period, the GitHub Copilot cohort gradually began to outpace the other cohort in terms of the sheer number of PRs.

Figure: Pull Request Merge Rate cohort comparison, with and without GitHub Copilot

Next up, I looked at speed.

I examined the Median Merge Time to see how quickly code was being merged into the codebase. The GitHub Copilot cohort’s code was consistently merged approximately 50% faster. The Copilot cohort improved relative to its previous performance and relative to the other cohort.

Figure: Median Merge Time cohort comparison, with and without GitHub Copilot
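
For anyone computing this by hand, the comparison can be sketched as follows. This is an illustration, not how Faros AI computes the metric: the record shape (`created_at`, `merged_at`), the helper names, and all the sample durations are made up, with numbers chosen for easy arithmetic rather than taken from our data.

```python
from datetime import datetime, timedelta
from statistics import median

def median_merge_hours(prs):
    """Median hours from PR creation to merge; unmerged PRs are skipped."""
    durations = [
        (pr["merged_at"] - pr["created_at"]).total_seconds() / 3600
        for pr in prs
        if pr.get("merged_at") is not None
    ]
    return median(durations)

def pct_faster(treated_hours, control_hours):
    """How much faster the treated cohort is, as a percentage of the control value."""
    return (control_hours - treated_hours) * 100 / control_hours

def make_prs(hours_to_merge):
    """Build hypothetical PR records that all open at the same time."""
    opened = datetime(2023, 6, 1, 9, 0)
    return [
        {"created_at": opened, "merged_at": opened + timedelta(hours=h)}
        for h in hours_to_merge
    ]

copilot_prs = make_prs([3, 6, 9])    # made-up durations
control_prs = make_prs([8, 10, 16])  # made-up durations

copilot_median = median_merge_hours(copilot_prs)
control_median = median_merge_hours(control_prs)
print(copilot_median, control_median, pct_faster(copilot_median, control_median))
# 6.0 10.0 40.0
```

The median is used instead of the mean so that a handful of long-lived PRs does not dominate the comparison.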

The most important speed metric, though, is Lead Time to production. I wanted to make sure that the acceleration in development wasn’t being negated by longer time spent in subsequent stages like Code Review or QA.

It was great to see that Lead Time decreased by 55% for the PRs generated by the GitHub Copilot cohort (similar to GitHub’s own research), with most of the time savings generated in the development (“Time in Dev”) and code review (“First Review Time”) stages.

Figure: Lead Time comparison with cycle time breakdown, with and without GitHub Copilot
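
To see where a lead time reduction comes from, you can break it down by stage. The sketch below assumes you already have average hours per stage for each cohort. "Time in Dev" and "First Review Time" are the stage names from our dashboard; the "Rest of Pipeline" bucket, the `lead_time_savings` helper, and every number are hypothetical.

```python
def lead_time_savings(control, treated):
    """Return the overall % reduction and each stage's share of the hours saved."""
    total_control = sum(control.values())
    saved = total_control - sum(treated.values())
    reduction_pct = saved * 100 / total_control
    if saved == 0:
        # Nothing was saved, so there is nothing to attribute to any stage
        return reduction_pct, {stage: 0.0 for stage in control}
    shares = {
        stage: (control[stage] - treated[stage]) * 100 / saved
        for stage in control
    }
    return reduction_pct, shares

# Hypothetical average hours per stage
control_stages = {"Time in Dev": 20, "First Review Time": 10, "Rest of Pipeline": 10}
copilot_stages = {"Time in Dev": 14, "First Review Time": 8, "Rest of Pipeline": 10}

reduction, shares = lead_time_savings(control_stages, copilot_stages)
print(reduction)  # 20.0
print(shares)     # {'Time in Dev': 75.0, 'First Review Time': 25.0, 'Rest of Pipeline': 0.0}
```

A breakdown like this is what tells you whether faster development is being offset by slower review or QA further down the line.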

The last dimension we analyzed was code quality and code security, where I looked at three metrics: Code Coverage, Code Smells, and Change Failure Rate.

  • Code Coverage improved, which didn’t surprise me. Copilot is very good at writing tests.
  • Code Smells increased slightly but were still beneath an acceptable threshold.
  • Change Failure Rate, the most important metric to me together with Lead Time, held steady.

Figure: Code Coverage comparison, with and without GitHub Copilot
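
Change Failure Rate is the simplest of the three to compute yourself: the share of deployments that caused a failure in production. A minimal sketch follows, assuming each deployment record carries a `caused_failure` flag. That field name and the sample records are hypothetical.

```python
def change_failure_rate(deployments):
    """Percentage of deployments flagged as having caused a production failure."""
    if not deployments:
        return 0.0
    failed = sum(1 for d in deployments if d["caused_failure"])
    return failed * 100 / len(deployments)

# Hypothetical deployment records for each cohort
copilot_deploys = [{"caused_failure": f} for f in (False, False, True, False)]
control_deploys = [{"caused_failure": f} for f in (False, True, False, False)]

print(change_failure_rate(copilot_deploys), change_failure_rate(control_deploys))
# 25.0 25.0
```

"Held steady" then simply means the two rates stay close to each other over the evaluation period.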

Analysis

But why did Copilot make such a noticeable difference? The engineers in our Copilot cohort said the boost is largely due to no longer starting from a blank page. It’s easier to edit an AI-driven suggestion than to start from scratch. You become an editor instead of a journalist. In addition, Copilot is great at writing unit tests quickly.

Whatever it is, the difference was clear and measurable.

Cost-Benefit Analysis

Now, the juicy bit – is the performance boost worth the cost? For us, the answer's leaning towards a solid "yes." A 55% improvement in lead time with no collateral damage to code quality is a phenomenal ROI. But, of course, every team's dynamics are different. If you're weighing the costs, consider not just the subscription fee but the potential long-term benefits in productivity and code quality.
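
One rough way to frame the cost side is a break-even calculation: how many hours per month does a developer need to save for the seat to pay for itself? The sketch below uses placeholder inputs only. The seat price and hourly cost are not actual pricing or salary data, and `break_even_hours` is a hypothetical helper.

```python
def break_even_hours(seat_cost_per_month, loaded_hourly_cost):
    """Hours a developer must save per month for the seat to pay for itself."""
    return seat_cost_per_month / loaded_hourly_cost

# Placeholder inputs: substitute your own seat price and fully loaded hourly cost
seat_cost = 20.0    # per developer per month (placeholder)
hourly_cost = 80.0  # fully loaded cost of one developer hour (placeholder)

print(break_even_hours(seat_cost, hourly_cost))  # 0.25
```

With these placeholder inputs, the seat breaks even after a quarter of an hour saved per month. Plugging in your own figures gives a concrete bar to compare against the speed gains you measure.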

Tips for Conducting Your Own Assessment

As I mentioned, lots of my peers want to create a similar analysis at their org. Today it’s GitHub Copilot, tomorrow it’ll be something else.

What made generating this comparison easy for me was two-fold:

  • I’m already tracking developer productivity metrics in Faros AI, based on data it knits together from Jira, GitHub, Buildkite, Statuspage, and PagerDuty.
  • Unlike cookie-cutter metrics tools, Faros AI has a complete, flexible BI layer that made it easy for me to define my two cohorts and create a custom dashboard for this specific analysis. It took me just a few minutes to generate my GitHub Copilot analysis dashboard.

Conclusion

So, back to our main question: Is GitHub Copilot worth the investment? Our data shouts a resounding "yes." But hey, tools are only as good as how we use them. It might be the perfect fit for some, while others might find alternative methods more suited to their workflow. Plus, if you have bottlenecks in your build and test cycles, your efficiency gains may be reduced.

The next big question organizations are going to face is where to direct the developer productivity they’ve just unleashed. If you’re going to embrace GitHub Copilot, you need to have a plan. There’s no shortage of roadmap initiatives and technical debt for folks to sink their teeth into, but leaders should be setting those priorities ahead of time.

I'm curious, have any of you taken GitHub Copilot for a spin? What's your verdict? Did your experience mirror ours or did you discover something entirely different? Get in touch with me at thomas@faros.ai.
