AI & Engineering · 10 min read · August 13, 2023

The Future of Code Review in the AI Era: What Changes and What Doesn't

AI is transforming software development faster than most practitioners expected. Code review is changing with it, but not in the direction most people assume.

Three years ago, the question "will AI replace code review?" would have seemed premature. Today it is a real strategic question for engineering organizations deciding how to run their review processes. The answer, based on what we've seen processing millions of reviews, is more nuanced than either the optimists or the pessimists suggest.

AI will automate the mechanical layer of code review completely, and has already substantially done so for teams that use AI review tools consistently. The substantive layer, which differs from the mechanical layer roughly as architecture differs from plumbing, will not only survive AI automation but grow more important as it becomes the part of the review process that cannot be automated.

What AI Review Handles Well Today

The mechanical layer of code review — finding null dereferences, flagging security vulnerabilities matching known patterns, identifying N+1 queries, spotting naming inconsistencies, checking error handling completeness — is well-handled by current AI review tools. These patterns are consistent enough across codebases that they can be trained reliably, and frequent enough that automating them provides substantial value.
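To make one of those mechanical patterns concrete, here is a minimal sketch of the kind of N+1 query such tools reliably flag, alongside the pattern-level fix they typically suggest. The schema and function names are hypothetical, invented for illustration; the example uses Python's built-in sqlite3 so it runs standalone.

```python
import sqlite3

def get_author_post_counts_n_plus_one(conn: sqlite3.Connection) -> dict:
    """N+1 pattern: one query for authors, then one more query per author."""
    authors = conn.execute("SELECT id, name FROM authors").fetchall()
    counts = {}
    for author_id, name in authors:
        # This per-row query is the issue a pattern-based reviewer flags.
        posts = conn.execute(
            "SELECT id FROM posts WHERE author_id = ?", (author_id,)
        ).fetchall()
        counts[name] = len(posts)
    return counts

def get_author_post_counts(conn: sqlite3.Connection) -> dict:
    """Pattern-level fix: aggregate everything in a single query."""
    rows = conn.execute(
        "SELECT a.name, COUNT(p.id) FROM authors a "
        "LEFT JOIN posts p ON p.author_id = a.id GROUP BY a.name"
    ).fetchall()
    return dict(rows)
```

Both functions return the same answer; the difference is one round trip versus one per author. This is exactly the kind of finding that is consistent across codebases: the pattern is recognizable from the diff alone, with no business context required.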

Our data shows that AI review catches roughly 65-70% of the issues that experienced human reviewers flag as significant, when evaluated against a held-out set of real production bugs. The remaining 30-35% is the substantive layer: issues that require understanding the intended behavior of the system, the business context of the change, the architectural vision of the codebase, and the non-obvious constraints that the code must satisfy.

What AI Review Handles Poorly

The current generation of AI review tools is systematically poor at three categories of issues.

Intent correctness: did this code solve the right problem? AI can verify that the code does what it appears to do; it cannot easily verify that what it appears to do is what the product or business needed.

Architectural coherence: does this change fit into the design of the system, or does it create a pattern that will cause problems as the system evolves? This requires a model of the system's history and trajectory that AI systems don't have access to.

Cross-codebase context: does this change have implications elsewhere in the codebase that aren't visible in the diff? AI review sees the changed files; it doesn't have efficient access to the full context of how those files interact with the rest of the system.
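A hypothetical illustration of the intent-correctness gap (the business rule, tax rate, and function names are all invented for this sketch): both functions below sail through a mechanical review. Neither has a bug, a naming problem, or a missing edge case. Only one matches a product requirement that discounts apply before tax, and nothing in the diff says which.

```python
TAX_RATE = 0.08  # hypothetical jurisdiction

def total_discount_after_tax(subtotal: float, discount: float) -> float:
    """Taxes the full pre-discount amount, then subtracts the discount.
    Mechanically clean, but wrong if the rule says tax the discounted price."""
    return subtotal * (1 + TAX_RATE) - discount

def total_discount_before_tax(subtotal: float, discount: float) -> float:
    """Applies the discount first, then tax. Matches a rule that tax
    is owed only on what the customer actually pays."""
    return (subtotal - discount) * (1 + TAX_RATE)
```

On a $100 subtotal with a $10 discount the two versions disagree by 80 cents, and only someone who knows the product requirement can say which is correct. That judgment is exactly what pattern-level review, human or AI, cannot extract from the changed lines alone.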

The Human Review That Emerges

When AI handles the mechanical layer reliably, human review changes character. Reviewers are freed from the exhausting pattern-checking work that occupies most of a typical review and can focus entirely on the substantive questions: Is this solving the right problem? Does this fit the architecture? Are the assumptions correct? Is there a simpler approach? These are the questions that require genuine human understanding and judgment — and they're the questions that produce the most valuable review conversations.

The emergent model is a three-layer review process: AI handles pattern-level issues immediately and consistently; human review handles intent and architecture; and the author handles the integration between them. This produces reviews that are faster (AI feedback is immediate), more thorough (AI doesn't miss patterns from fatigue), and more substantive (human attention goes to the questions only humans can answer).

The Skills That Become More Valuable

As AI handles more of the mechanical review work, the skills that differentiate excellent reviewers become more specifically human: systems thinking, product judgment, the ability to model how a change will interact with the system's future evolution, and the interpersonal intelligence to deliver substantive feedback in a way that produces learning and collaboration rather than defensiveness. These skills were always the most valuable part of code review. They'll become more obviously so as everything automatable is automated. Invest in developing them deliberately — in yourself and in your team.

Try CodeMouse on your next PR

Free AI code review on every pull request. Bring your own API key — no subscription needed.

Install on GitHub — Free