Why Engineering Quality Drops When Teams Scale (And the One Thing That Prevents It)
Every engineering leader knows that quality erodes as teams grow. Almost none can explain precisely why — or address the root cause rather than the symptoms.
There's a specific size at which engineering quality begins to degrade in growing companies, and it's more consistent than most engineering leaders expect: somewhere between 25 and 40 engineers. Below that threshold, standards are maintained largely through osmosis — engineers are close enough to the founding team that quality expectations are transmitted through direct interaction, code review, and the visible example of senior engineers. Above that threshold, osmosis stops working and something structural is needed. Most companies don't have the structural thing ready when they need it.
Why Osmosis Fails at Scale
When a founding engineer reviews a junior engineer's code, they're not just verifying correctness — they're transmitting a model of what "good" looks like in this specific codebase, for this specific team, solving these specific problems. This model includes things that are never written down: the level of abstraction that's valued, the tolerance for complexity, the expected relationship between feature velocity and test coverage, the implicit security assumptions that underpin the architecture.
At 15 engineers, the founding team can review most of the code most of the time. At 40 engineers, they can review maybe 15% of it. The other 85% is reviewed by engineers who received the model second or third hand, who may have interpolated parts of it incorrectly, and who are now transmitting their interpolated version to the engineers they review. This is how quality degrades: not catastrophically, but through accumulated drift in what "good enough" means.
The Standards Debt Problem
The underlying problem is that most early-stage engineering teams accumulate what I call standards debt: a large and growing body of implicit knowledge about how things should be done that exists only in the heads of the longest-tenured engineers and is never systematically codified. This debt is invisible when the team is small. It becomes a serious liability when the team grows past the osmosis threshold and there's nothing to replace it.
Standards debt is distinct from technical debt. You can have a clean, well-maintained codebase with enormous standards debt — the code looks good because the people who wrote it knew what good looked like, but there's nothing that would help a new engineer understand what good looks like without learning it from a specific person.
Codifying Standards Without Creating Bureaucracy
The solution to standards debt is codification — making the implicit explicit — but this needs to be done in a way that doesn't create a bureaucratic compliance burden that engineers route around. The most effective mechanisms are lightweight, living documents that capture decisions and their reasoning rather than rules and their enforcement.
Architecture Decision Records (ADRs) are the single highest-leverage codification practice most teams don't use: short documents that capture a significant architectural decision, the context that motivated it, the alternatives considered, and the reasoning behind the choice. An ADR repository of 20-30 decisions covers most of the institutional knowledge that would otherwise live only in the heads of the founding engineers.
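A minimal ADR can be a short markdown file in the repository. The structure below is one common convention (sections and the example decision are illustrative, not prescriptive):

```markdown
# ADR-007: Use PostgreSQL row-level locking for inventory updates

## Status
Accepted

## Context
Concurrent checkout requests were producing oversold inventory.
We needed a concurrency control strategy for the inventory table.

## Alternatives Considered
- Optimistic locking with a version column: simpler, but retry
  storms under peak load in our benchmarks.
- Application-level distributed lock: adds an infrastructure
  dependency we don't otherwise need.

## Decision
Use `SELECT ... FOR UPDATE` within the checkout transaction.

## Consequences
Checkout latency rises slightly under contention; no new
infrastructure; future sharding will require revisiting this.
```

The value is less in the decision itself than in the recorded context and rejected alternatives, which are exactly the pieces a new engineer cannot reconstruct from the code.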
Review standards documents — not style guides, but documents that articulate what reviewers should be looking for and why — give new reviewers a framework that would otherwise take months of osmosis to absorb.
Automated Standards as a Complement to Documented Standards
Documentation alone isn't sufficient because documents get outdated and ignored. The most effective quality maintenance systems combine documented standards with automated enforcement of the most important ones. Linters enforce style. Type checkers enforce interface contracts. AI code review enforces the pattern-level standards — the security checks, the error handling requirements, the performance anti-patterns — that are too nuanced for simple linting rules but too consistent to require human judgment on every occurrence.
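As a sketch of what automated enforcement of a pattern-level standard can look like, here is a minimal AST-based check that flags bare `except:` clauses — a hypothetical example of an error-handling rule a team might run in CI alongside a conventional linter (the rule choice and function name are illustrative):

```python
import ast

def find_bare_excepts(source: str) -> list[int]:
    """Return the line numbers of bare `except:` clauses in the source.

    A bare `except:` swallows every exception, including KeyboardInterrupt
    and SystemExit, so many teams ban it outright.
    """
    tree = ast.parse(source)
    return [
        node.lineno
        for node in ast.walk(tree)
        # ExceptHandler.type is None only for a bare `except:` clause.
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]

sample = """\
try:
    risky_operation()
except:
    pass
"""
print(find_bare_excepts(sample))  # → [3]
```

A check like this is cheap to write, runs in milliseconds, and never gets tired or inconsistent — which is precisely why mechanical standards belong in automation rather than in a reviewer's head.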
The goal is to automate the standards that are objective and consistent enough to be reliably checked, and document the standards that require judgment — freeing human reviewers to exercise judgment rather than repeat mechanical checks that automation handles more consistently.
Engineering quality at scale is an organizational design problem, not a talent problem. The companies that maintain quality through growth are the ones that treat standards as infrastructure and invest in them with the same rigor they apply to technical infrastructure. The companies that don't make that investment tend to believe that hiring better engineers will fix quality. It won't.
Try CodeMouse on your next PR
Free AI code review on every pull request. Bring your own API key — no subscription needed.
Install on GitHub — Free