There’s a certain comfort in seeing a pull request pass all checks. Green builds, no failing tests, everything looks solid.
But here’s the uncomfortable question: what exactly did your tests cover?
That’s where code coverage enters the conversation—not as a vanity metric, but as a signal. When used correctly, it helps you understand not just whether your code works, but how much of it you’ve actually verified.
Coverage Is a Map, Not a Score
At its core, code coverage measures how much of your codebase is executed when your test suite runs. The most common types include:
Line coverage – which lines were executed
Branch coverage – which logical paths (if/else, switch cases) were taken
Function/method coverage – which functions were invoked
Most tools default to line coverage because it’s easy to compute and understand. But it can also be misleading.
A test that executes a line doesn’t necessarily validate its correctness. You can hit 100% line coverage and still miss critical bugs—especially around edge cases and branching logic.
So instead of treating coverage as a goal, it’s more useful to treat it as a map of untested territory.
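To make that gap concrete, here is a minimal sketch (Python used for brevity; the function and its shipping rule are fabricated for illustration). A single test executes every line of the function, yet exercises only one of its two branches:

```python
# One test can reach 100% line coverage here while hitting
# only one of the two branches (50% branch coverage).

def shipping_fee(total: float) -> float:
    """Flat fee under 100, free shipping otherwise (illustrative rule)."""
    fee = 0.0
    if total < 100:
        fee = 9.99
    return fee

# This single call executes every line (the `if` line runs even
# when its condition is false)...
assert shipping_fee(50) == 9.99
# ...but the total >= 100 path is never tested, so a regression
# there would slip through despite "full" line coverage.
```

This is exactly why branch coverage catches risks that line coverage hides.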
Why Coverage Matters in CI/CD
CI/CD pipelines are about confidence. Every commit that flows through your pipeline is a potential production deployment. Coverage acts as a guardrail—not by guaranteeing correctness, but by highlighting risk.
When integrated into CI/CD, coverage helps you:
Detect untested code introduced in a pull request
Prevent silent degradation of test quality over time
Encourage consistent testing practices across contributors
More importantly, it creates accountability. Without coverage checks, it’s easy for teams to gradually stop writing meaningful tests—especially under delivery pressure.
The Baseline Question: How Much Is Enough?
This is where things get opinionated.
You’ll often hear numbers like 70%, 80%, or 90% thrown around as “good coverage.” In reality, the right number depends on your domain:
A prototype or internal tool might tolerate lower coverage
A financial or healthcare system should aim much higher
Legacy systems often start low and improve incrementally
What matters more than the number itself is consistency and trend.
A stable 75% with thoughtful tests is far more valuable than a forced 90% filled with shallow assertions.
That said, many teams adopt practical thresholds:
80% line coverage as a general baseline
Higher thresholds (85–90%) for critical modules
Lower thresholds temporarily when dealing with legacy code
The key is to treat thresholds as minimum quality gates, not targets to game.
Enforcing Coverage in a Pipeline
Modern CI systems make it straightforward to enforce coverage, but the implementation details matter.
A typical flow looks like this:
1. Run tests with coverage enabled
2. Generate a coverage report (e.g., XML, HTML, or JSON)
3. Compare results against a defined threshold
4. Fail the pipeline if the threshold is not met
For example, in a PHP/Laravel setup using PHPUnit:
test:
  script:
    - php artisan test --coverage --min=80
Or with more control using PHPUnit directly:
phpunit --coverage-clover=coverage.xml
Then you can enforce thresholds either by having a pipeline step inspect the report that PHPUnit generates, with the report output configured in phpunit.xml (PHPUnit 9.3+ syntax):

<coverage>
    <report>
        <clover outputFile="coverage.xml"/>
        <text outputFile="php://stdout" showUncoveredFiles="true"/>
    </report>
</coverage>
Or by using external tools like SonarQube, which allow you to define quality gates that fail builds when coverage drops below a certain percentage.
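The threshold comparison itself can be a very small pipeline step. As a sketch (Python for brevity, with a fabricated Clover report inlined as a string; a real step would read the coverage.xml generated above), parse the Clover metrics and compare them against the gate:

```python
# Sketch of a coverage quality gate: read statement counts from a
# Clover-format report and compare against a minimum percentage.
# SAMPLE_CLOVER is fabricated for illustration.
import xml.etree.ElementTree as ET

SAMPLE_CLOVER = """\
<coverage generated="0">
  <project timestamp="0">
    <metrics statements="200" coveredstatements="164"/>
  </project>
</coverage>
"""

def line_coverage(clover_xml: str) -> float:
    """Return line (statement) coverage percentage from a Clover report."""
    root = ET.fromstring(clover_xml)
    metrics = root.find("./project/metrics")
    statements = int(metrics.get("statements"))
    covered = int(metrics.get("coveredstatements"))
    return 100.0 * covered / statements if statements else 100.0

def meets_threshold(clover_xml: str, minimum: float = 80.0) -> bool:
    """True when coverage is at or above the quality gate."""
    return line_coverage(clover_xml) >= minimum

print(f"coverage: {line_coverage(SAMPLE_CLOVER):.1f}%")  # coverage: 82.0%
print("gate:", "pass" if meets_threshold(SAMPLE_CLOVER) else "fail")
```

In a real pipeline this script would exit non-zero when the gate fails, which is what actually breaks the build.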
The “Diff Coverage” Approach
One of the more effective strategies—especially in mature teams—is diff coverage.
Instead of enforcing coverage across the entire codebase, you enforce it only on new or changed code.
This solves a common problem: legacy codebases with low coverage. Rather than blocking progress, you ensure that every new line added is properly tested.
Tools like diff-cover, GitLab’s built-in coverage visualization, or SonarQube can help implement this approach.
It’s a small shift, but it changes team behavior significantly.
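Conceptually, diff coverage is simple: intersect the lines a change touched with the lines the tests executed. A toy sketch (Python for brevity; the inputs below are fabricated, whereas real tools derive them from `git diff` and the coverage report):

```python
# Toy diff coverage: of the lines a PR added or modified, what
# percentage did the test suite execute?

# lines executed by tests, per file (would come from a coverage report)
covered = {"app/Invoice.php": {10, 11, 12, 20}}

# lines added or modified in the PR, per file (would come from git diff)
changed = {"app/Invoice.php": {11, 12, 13, 14}}

def diff_coverage(covered: dict, changed: dict) -> float:
    """Percentage of changed lines that are covered by tests."""
    total = hit = 0
    for path, lines in changed.items():
        total += len(lines)
        hit += len(lines & covered.get(path, set()))
    return 100.0 * hit / total if total else 100.0

print(f"diff coverage: {diff_coverage(covered, changed):.0f}%")  # 50%
```

Gating on this number means a PR can land in a 40%-covered legacy codebase, as long as its own changes are well tested.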
Where Coverage Falls Short
Coverage is often misunderstood because it’s easy to measure.
But what it doesn’t tell you is just as important:
It doesn’t guarantee meaningful assertions
It doesn’t ensure edge cases are handled
It doesn’t validate business logic correctness
A test that calls a method and asserts true === true will still increase coverage.
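As a sketch of that failure mode (Python for brevity; `apply_tax` and both tests are fabricated for illustration), these two tests produce identical coverage, but only one of them would ever catch a regression:

```python
# Both "tests" below yield the same line coverage for apply_tax,
# but only the second verifies anything.

def apply_tax(amount_cents: int) -> int:
    """Hypothetical example: add 20% tax, in integer cents."""
    return amount_cents + amount_cents // 5

def test_shallow():
    apply_tax(100)   # every line executes...
    assert True      # ...but nothing is validated

def test_meaningful():
    assert apply_tax(100) == 120  # pins the actual behavior

test_shallow()
test_meaningful()
```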
That’s why high-performing teams combine coverage with:
Code reviews focused on test quality
Mutation testing (to verify test effectiveness)
Static analysis and type checking
Coverage is a signal—but it needs context.
Making It Work in Real Teams
The most successful use of coverage in CI/CD isn’t strict enforcement—it’s gradual alignment.
Start by measuring. Then visualize. Then enforce lightly.
Over time, raise expectations as the team adapts.
A good pattern looks like this:
Introduce coverage reporting without enforcement
Add a soft threshold (warnings, not failures)
Transition to hard thresholds for new code
Gradually raise the bar where it makes sense
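One way to "raise the bar" without manually bumping thresholds is a coverage ratchet: the build fails only if coverage drops below the best value recorded so far, and the record rises with each improvement. A sketch (in practice the baseline would live in a committed file or CI variable; the names and slack value here are illustrative):

```python
# Coverage "ratchet" sketch: the gate follows the team's best
# result upward instead of sitting at a fixed number.

def ratchet(current: float, baseline: float, slack: float = 0.5):
    """Return (passes, new_baseline).

    `slack` tolerates tiny dips (e.g. deleting well-tested code)
    so the gate doesn't punish harmless refactors.
    """
    passes = current >= baseline - slack
    new_baseline = max(baseline, current) if passes else baseline
    return passes, new_baseline

print(ratchet(current=76.2, baseline=75.0))  # improvement: bar rises
print(ratchet(current=72.0, baseline=76.2))  # real drop: gate fails
```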
This avoids the common trap of teams gaming the system just to pass builds.
The Real Value
Code coverage isn’t about hitting a number. It’s about reducing uncertainty.
When a deployment goes out, you want to know that the critical paths—the things that matter most—have been exercised, validated, and protected against regression.
Coverage won’t tell you everything.
But without it, you’re flying blind.