Software engineering with AI demands more than better tools

At enterprise scale, software engineering with AI demands a connected system spanning strategy, tooling, cost management, adoption, measurement, governance, and the context layer that makes AI output production-ready.


Software engineering with AI in 2026

Software engineering with AI is the practice of building and operating software engineering organizations where AI tools and agents contribute materially to design, coding, review, testing, and deployment. It has moved well past the pilot phase, as nearly every enterprise engineering organization now runs developer-facing AI tools in production. In 2026, the challenges facing most AI leaders are what sits downstream of AI adoption: governing AI-generated code at scale, measuring what the code is actually producing in the system, and running an engineering organization where AI authors more of the work than humans do.

The research that follows puts numbers to those challenges. Software engineering with AI is a categorically different system from anything engineering organizations have run before. Getting it right breaks down into seven distinct but overlapping areas, and neglecting any one of them pushes the strain onto the others until the whole system starts to fray.

The Acceleration Whiplash: What the latest research says about AI in software development

The latest data on AI in software development is from Faros’s 2026 AI Engineering Report, which synthesizes two years of telemetry from 22,000 developers across more than 4,000 teams. 

60% of AI-generated code is now being accepted into codebases, up from 20% a year earlier. AI has crossed the threshold from suggesting code to writing it.

The throughput gains are real. Task completion per developer is up 34% under high AI adoption. Epics completed per developer are up 66%. Code-related tasks have risen 210% at the team level. 

The downstream numbers tell a different story. Bugs per developer are up 54%. The incident-to-PR ratio has more than tripled. Median PR review time has grown 5x, and 31% more PRs are now merging without any review at all. 

Faros terms this the Acceleration Whiplash. AI has flooded a system built around human-paced development and human-quality code with output it was never designed to absorb. 

One finding cuts across every segment of the data: engineering maturity does not protect against this shift. Organizations with solid pre-AI practices and strong DORA metrics are experiencing the same quality deterioration as less mature organizations. Strong foundations are necessary, but they are nowhere near sufficient.

Key findings from the Acceleration Whiplash, the AI Engineering Report 2026.

{{whiplash}}

What running software engineering with AI actually means

AI is changing how engineering leaders build products and run their organizations, and the playbook for doing it well is still being written. Buying more tools and pushing greater adoption won't close the gap the research is pointing at. Running software engineering with AI at enterprise scale requires a connected system of practices that spans strategy, tooling, cost management, adoption, measurement, governance, and the context layer that makes AI output production-ready. Each pillar is a body of work on its own, and none of them hold up in isolation.

1. AI transformation planning and strategy

AI strategy and transformation planning is the discipline of moving AI adoption from pilot sprawl into a scalable, prioritized, ROI-backed program accountable to business outcomes.

To enable AI transformation, engineering leaders must have complete visibility into where AI usage sits across the organization, which licenses are dormant, and how AI is changing productivity, velocity, and quality across the business. A defensible baseline reshapes the prioritization conversation, while benchmarking against industry leaders and quantifying the ROI of each candidate intervention converts the investment case into numbers the CFO can sign off on. The harder work is downstream, where strategy has to land in team-level assignments with cross-functional accountability and KPIs anchored to measurable business outcomes. See how Faros helps you prepare your engineering workforce for AI-powered excellence. 

2. AI coding tools comparison

Tooling evaluation, selection, and optimization is the discipline of measuring AI coding tools and autonomous agents against system telemetry, and continuously tuning the stack as capabilities and workloads shift.

The AI coding landscape now spans IDE assistants, chat-based tools, autonomous agents, and review and test specialists. Since most enterprises run a combination of AI tools, rigorous evaluation starts with objective telemetry on the downstream impact of each tool: merged PR quality, cycle time, incident rate on AI-authored code, review burden, and cost per merged outcome. Developer sentiment is another signal worth overlaying in evaluation, because a tool engineers abandon after the onboarding window isn't earning its cost regardless of benchmark performance.
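
To make "cost per merged outcome" concrete, here is a minimal sketch of how two tools might be compared side by side. The tool names, costs, and thresholds are hypothetical placeholders, and a real evaluation would pull these figures from telemetry rather than hard-code them.

```python
from dataclasses import dataclass

@dataclass
class ToolStats:
    name: str
    monthly_cost: float  # licenses plus token spend, in dollars
    merged_prs: int      # PRs written with the tool's help that actually merged
    incidents: int       # incidents traced back to that AI-authored code

def cost_per_merged_pr(t: ToolStats) -> float:
    # Cost per merged outcome: spend divided by work that shipped
    return t.monthly_cost / t.merged_prs if t.merged_prs else float("inf")

def incident_rate(t: ToolStats) -> float:
    # Downstream quality signal: incidents per merged PR
    return t.incidents / t.merged_prs if t.merged_prs else 0.0

# Illustrative numbers only
tools = [
    ToolStats("ide_assistant", 12_000, 480, 6),
    ToolStats("autonomous_agent", 9_000, 150, 9),
]
for t in sorted(tools, key=cost_per_merged_pr):
    print(t.name, round(cost_per_merged_pr(t), 2), round(incident_rate(t), 3))
```

The point of the sketch is the denominator: dividing by merged PRs rather than by seats or suggestions is what makes the comparison reflect outcomes instead of activity.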

3. AI cost management and optimization

Cost and spend management for AI coding tools is the discipline of tying AI investment to downstream engineering outcomes rather than to seat count or license volume.

AI coding pricing has moved from flat per-seat to consumption-based tiers, with token allocations that reset on rolling windows. In late 2025, Claude Code, for example, ran 44k / 88k / 220k tokens per 5-hour window by tier, with weekly caps layered on top, and a single Opus-heavy agentic session could consume what a team budgeted for the month. Model mix drives the invoice more than headcount does. Defensible spend management pairs cost data with output: baseline DORA metrics before rollout, then compare individual token burn against shipped work on a quarterly cadence. Negative-ROI users surface quickly, licenses reallocate from low-value to high-value use cases, and manager 1:1s gain a data point that reshapes behavior without waiting on a CFO review.
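
The quarterly token-burn-versus-shipped-work comparison can be sketched in a few lines. Everything here is an assumption for illustration: the $15-per-million-token price, the thresholds, and the developer records are hypothetical, and real figures would come from billing and version-control telemetry.

```python
def flag_low_roi(users, cost_per_mtok=15.0, min_merged=3, max_cost_per_merge=200.0):
    """Flag seats whose quarterly token spend outpaces shipped work.

    Thresholds are illustrative defaults, not recommendations.
    """
    flagged = []
    for u in users:
        spend = u["tokens_millions"] * cost_per_mtok
        merged = u["merged_prs"]
        # Flag seats with almost no merged output, or a high cost per merge
        if merged < min_merged or spend / max(merged, 1) > max_cost_per_merge:
            flagged.append((u["id"], round(spend, 2), merged))
    return flagged

quarter = [
    {"id": "dev_a", "tokens_millions": 40.0, "merged_prs": 25},  # healthy ratio
    {"id": "dev_b", "tokens_millions": 90.0, "merged_prs": 2},   # heavy burn, little output
]
print(flag_low_roi(quarter))  # dev_b surfaces for the manager 1:1
```

A report like this, run on a quarterly cadence, is what turns "model mix drives the invoice" from an observation into a reallocation decision.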

4. Scaling AI adoption and usage

Scaling adoption is the work of moving AI coding tools from isolated pilots to consistent, measured usage across the engineering organization.

Scaling AI adoption across a global enterprise engineering org typically runs in waves. A small group of power users and internal champions pilots the tools, codifies what works into prompt libraries, instruction files, playbooks, and sample PRs, then hands that material to enablement leads who run team-by-team rollout. Additionally, engineering leaders can use IDE and tool telemetry to determine which seats are active versus idle, which teams are stuck, and which patterns deserve to propagate. As AI in engineering continues to grow, executive sponsorship matters, because engineers can rarely carve out time for new workflows on their own. The result is a usage baseline the rest of the program can measure and govern against.
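
The active-versus-idle seat question above reduces to a simple query over last-activity timestamps. This is a minimal sketch assuming a 14-day idleness window, which is an arbitrary choice; the right cutoff depends on the organization's cadence.

```python
from datetime import date, timedelta

def classify_seats(last_active: dict[str, date], today: date, idle_days: int = 14):
    """Split licensed seats into active vs idle by last recorded tool event."""
    cutoff = today - timedelta(days=idle_days)
    active = {user for user, last in last_active.items() if last >= cutoff}
    idle = set(last_active) - active
    return active, idle

# Illustrative data: dev_b's license is a candidate for reallocation
seats = {
    "dev_a": date(2026, 1, 30),
    "dev_b": date(2025, 11, 2),
}
active, idle = classify_seats(seats, today=date(2026, 2, 1))
print(sorted(active), sorted(idle))
```

The same grouping, rolled up by team instead of by user, answers the "which teams are stuck" question in the paragraph above.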

5. Measuring AI impact and outcomes

Measuring AI impact is the practice of capturing AI's effects on system-level outcomes across the full SDLC, rather than developer activity inside any single tool.

Activity metrics like lines authored, PRs opened, and suggestions accepted say little about the benefits AI is actually producing. To measure the effects of AI in software development, engineering leaders should consider four dimensions: velocity, quality, security, and developer satisfaction. Lead with objective telemetry signals (task management, IDEs, static analysis, version control, CI/CD, and incident management), then overlay developer survey data periodically to understand sentiment and friction. Most organizations track task throughput, PR merge rate, cycle time, and defect rates. Because factors like seniority, repo, and team composition confound raw correlations, causal analysis is needed to isolate AI's real effect.
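
A rollup of those system-level signals can be sketched as follows. The event schema here is invented for illustration; a real pipeline would ingest these records from version control and incident management, and the median shown is the upper median for even-length samples.

```python
def team_metrics(events):
    """Roll flat telemetry events into system-level outcome metrics."""
    prs = [e for e in events if e["type"] == "pr_merged"]
    bugs = sum(1 for e in events if e["type"] == "bug")
    incidents = sum(1 for e in events if e["type"] == "incident")
    cycle_hours = sorted(e["cycle_hours"] for e in prs)
    return {
        "merged_prs": len(prs),
        # Upper median for simplicity; a stats library would interpolate
        "median_cycle_hours": cycle_hours[len(cycle_hours) // 2] if cycle_hours else None,
        "bugs_per_pr": bugs / len(prs) if prs else 0.0,
        "incident_to_pr": incidents / len(prs) if prs else 0.0,
    }

# Illustrative events for one team over one period
events = [
    {"type": "pr_merged", "cycle_hours": 6},
    {"type": "pr_merged", "cycle_hours": 30},
    {"type": "bug"},
    {"type": "incident"},
]
print(team_metrics(events))
```

Tracking exactly these ratios over time is what surfaces an Acceleration Whiplash pattern: merged PRs climbing while bugs-per-PR and incident-to-PR climb faster.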

6. AI risk, governance and control

Risk, governance, and control is the practice of encoding review, security, and agent-scope policy in the delivery pipeline itself, enforced at the moment of change rather than after an incident.

As AI writes more of the code and agents take on work without a human in the loop, governance has to move out of the wiki and into the pipeline. The 31% of PRs now merging without human review is what happens when it doesn't. The fix is to route scrutiny by risk rather than by author. Path-based rules put senior eyes on the code where incidents actually start, while agent permission scopes and PR size caps keep a small task from quietly mutating forty files or sliding through a shallow review. Version pinning and provenance tagging surface silent degradation before it compounds, and a kill switch gives you a way to pause agent activity when something goes sideways. Done well, governance lets AI in engineering scale safely and securely, without the downsides compounding alongside the gains.
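
Routing scrutiny by risk rather than by author can be sketched as a merge-time policy check. The path patterns, the 20-file agent cap, and the policy strings are hypothetical examples, not Faros defaults.

```python
from fnmatch import fnmatch

# Illustrative policy: incident-prone paths get senior review
SENIOR_REVIEW_PATHS = ["auth/*", "payments/*", "*/migrations/*"]
MAX_FILES_PER_AGENT_PR = 20  # size cap for agent-authored changes

def review_policy(changed_files: list[str], author_is_agent: bool) -> str:
    """Decide the review requirement for a PR at the moment of change."""
    if author_is_agent and len(changed_files) > MAX_FILES_PER_AGENT_PR:
        return "block: split PR (agent size cap exceeded)"
    if any(fnmatch(f, p) for f in changed_files for p in SENIOR_REVIEW_PATHS):
        return "require: senior human review"
    return "require: standard review"  # nothing merges unreviewed

print(review_policy(["auth/login.py"], author_is_agent=True))
```

Because the check runs in the pipeline on every change, the policy is enforced before merge rather than reconstructed after an incident.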

7. Harness engineering

Harness engineering is the discipline of architecting an AI agent's entire information ecosystem, so agent output lands in the codebase with the same intent, standards, and constraints human engineers work from.

Prompt engineering is giving an agent a task. Context engineering is giving it access to the codebase, git history, dependencies, team standards, and test patterns it needs to do that task the way your human team would. One way to provide that context is through high-quality, information-rich Jira tickets. A ticket with a clear objective, explicit acceptance criteria, linked artifacts, and the right issue type gives an agent the same starting point a senior engineer would have—turning the task specification itself into a reliable input for execution rather than a source of ambiguity the agent has to guess its way through. When organizations skip steps like these, AI agents default to plausible-looking code that violates patterns the codebase depends on. The Acceleration Whiplash report put a number on the cost: code churn is up 861% under high AI adoption, meaning much of what AI writes is being removed soon after it lands. The practical work sits in task specs with real objectives and success criteria, a repo-level AGENTS.md that encodes the team's "good" and "bad" patterns, and feedback loops that let what ships and what gets reverted shape the next output. "Connect AI to internal context," one of the Whiplash report's direct prescriptions, is harness engineering in a nutshell.
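
The ticket-to-agent handoff described above can be sketched as a small guard plus an assembly step. The field names and the refusal behavior are assumptions for illustration; the idea is simply that underspecified work gets rejected before an agent has to guess.

```python
def build_agent_prompt(ticket: dict, agents_md: str) -> str:
    """Assemble agent context from a ticket; refuse underspecified work."""
    required = ["objective", "acceptance_criteria", "linked_artifacts"]
    missing = [k for k in required if not ticket.get(k)]
    if missing:
        # A human would push the ticket back; the harness does the same
        raise ValueError(f"ticket underspecified, agent would have to guess: {missing}")
    return "\n\n".join([
        f"Objective: {ticket['objective']}",
        "Acceptance criteria:\n- " + "\n- ".join(ticket["acceptance_criteria"]),
        "Relevant artifacts: " + ", ".join(ticket["linked_artifacts"]),
        "Team conventions:\n" + agents_md,  # repo-level AGENTS.md contents
    ])

# Illustrative ticket and conventions
ticket = {
    "objective": "Add rate limiting to the login endpoint",
    "acceptance_criteria": ["429 after 5 failed attempts", "covered by an integration test"],
    "linked_artifacts": ["JIRA-123", "auth/login.py"],
}
prompt = build_agent_prompt(ticket, agents_md="Prefer small PRs; never log credentials.")
print(prompt.splitlines()[0])
```

The feedback-loop half of harness engineering would extend this: what merged cleanly and what got reverted feeds back into the conventions the next prompt carries.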

Software engineering with AI: Charting the path forward

The playbook for doing software engineering with AI well is still being written, and it isn't getting any easier. But there's real reason for optimism. AI is changing how engineering leaders build products and run their organizations, and the ones pulling ahead are synthesizing the pieces across all seven of these pillars rather than focusing on the two or three that feel most urgent this quarter. The next couple of years come down to understanding and optimizing the operating model underneath AI.

Faros is the system for running engineering with AI. We give engineering leaders visibility into how work operates across code, people, and systems, and control over how that work progresses through enforceable workflows and policy. This enables organizations to deploy AI effectively and improve engineering throughput with stronger cost efficiency. Request a demo to see what Faros can do for you.

Frequently asked questions about software engineering with AI

What is software engineering with AI? 

Software engineering with AI is the practice of building and operating software engineering organizations where AI tools and agents contribute materially to design, coding, review, testing, and deployment. By 2026, most enterprise engineering teams have AI tooling in production and are working on how to govern, measure, and scale it without eroding quality.

What is the Acceleration Whiplash? 

The Acceleration Whiplash is the term coined in Faros's 2026 AI Engineering Report for the widening gap between AI throughput gains and downstream quality, cost, and incident metrics. Research across 22,000 developers shows task completion up 34% and epics up 66%, alongside bugs up 54% and an incident-to-PR ratio more than tripled.

How do you measure the ROI of AI coding tools? 

ROI measurement for AI coding tools requires connecting tool usage to system-level outcomes across four dimensions: velocity, quality, security, and developer satisfaction. Organizations often use the SPACE framework (which covers Satisfaction, Performance, Activity, Communication, and Efficiency), in addition to key metrics such as task throughput, PR merge rate, cycle time, and defect rates. Causal analysis separates AI's real effect from confounds like seniority and repository.

Why does AI adoption increase bugs and incidents? 

AI sharply increases the volume of code reaching the codebase, and engineering systems built around human-paced review were not designed to absorb that volume. Faros's Acceleration Whiplash research found bugs per developer up 54%, the incident-to-PR ratio more than tripled, and 31% more PRs merging without any human review. The strain concentrates in review, incident, and context layers downstream of the tool.

What is harness engineering?

Harness engineering is the discipline of orchestrating an AI agent's entire information ecosystem so agent output lands in the codebase with the same intent, standards, and constraints human engineers work from. It includes the codebase, git history, dependencies, team standards, test patterns, and feedback loops that let what ships or gets reverted shape the next output.

Does engineering maturity protect against AI quality issues? 

No. Research across 22,000 developers found that organizations with solid pre-AI practices and strong DORA metrics are experiencing the same downstream quality deterioration as less mature teams. Strong foundations are necessary for running AI coding at scale but are nowhere near sufficient.

Neely Dunlap

Neely Dunlap is a content strategist at Faros who writes about AI and software engineering.
