The great testing pyramid of devops

Feb 7, 2022 / Guðni Ólafsson | 4 min read

Are you using the test pyramid correctly in software development? This simple concept needs careful consideration before you can actually get some value from applying it. Let's take a look at the great testing pyramid of Cheops, I mean DevOps.

What is the test pyramid?

The test pyramid is surprisingly much like the actual pyramids. It looks very simple on the outside, but it is confusing and difficult to navigate on the inside, although it is not actually full of deadly traps.

At its simplest, it is just a heuristic for choosing where to expend effort in test automation projects. It has been floating around agile circles since 2009, and it states that the longer a kind of test takes to run, the fewer tests of that kind there should be.
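Read literally, the heuristic is a check that fits in a few lines. The following is a minimal sketch, not an established tool: the function name `follows_pyramid` and the example numbers are made up for illustration.

```python
def follows_pyramid(levels):
    """Return True if slower kinds of tests are never more numerous than faster ones.

    `levels` is a list of (name, avg_seconds_per_test, test_count) tuples.
    """
    # Order the kinds of tests from fastest to slowest.
    ordered = sorted(levels, key=lambda level: level[1])
    counts = [count for _name, _seconds, count in ordered]
    # Each slower kind must have at most as many tests as the faster kind before it.
    return all(faster >= slower for faster, slower in zip(counts, counts[1:]))


# Illustrative numbers only, not measurements from a real project.
suite = [("unit", 0.01, 500), ("integration", 2.0, 80), ("e2e", 45.0, 12)]
print(follows_pyramid(suite))
```

Note that the check says nothing about what counts as a "unit" or an "integration" test. It only compares run time against test count, which is all the heuristic itself promises.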

To me, this is the clearest description of the test pyramid. A more concrete description is available on the same site. Mike Cohn introduced it in "Succeeding with Agile". Though I suspect more have heard of it from Google's testing blog.

There are a lot of posts heralding it as the new(ish) silver bullet for agile development. But there are also a lot of people who are less impressed. I happen to fall into that group. In this post, I aim to explain to you why.

What is the problem with it?

The first problem is of course that: "Unit tests ARE NOT."

They are not valuable tests from a QA perspective, since they don't really find many bugs at all, and they find practically none of the most common type of bug: integration bugs. They need a lot of maintenance in the early phases of projects and are simply not suitable for some kinds of code. Now please don't lynch me yet. Keep in mind that what I think of when I see the word unit-test is not the same thing that you think of. Not exactly.

The second, more obvious, problem is that most people are not talking about the same thing. We have wildly different definitions of unit tests and integration tests. Sometimes it seems that practically no one is using the same definitions. In fairness, making such definitions is a non-trivial problem. The lack of a consensus is the root of the problem. It is also the main reason why people still argue about many best/better/good practices.

The third problem is, of course, that different teams have wildly different needs, depending on size, project type, and code-base. A Google best practice is almost certainly an antipattern in a smaller shop. Google is optimizing for problems that the small organization will definitely never face, and it doesn't have to worry about many problems that smaller ones will face.

Can we fix the problems?

People shouting at each other online because of different definitions is par for the course. Of course. It was upsetting when I realized that I did exactly that with the test pyramid. I never looked past my own definitions of unit and integration tests. Then I condemned the whole thing as silly, as a new silver bullet for agile consultants and evangelists to shoot their teammates with. It turns out that with suitable definitions of units and integrations the pyramid can make sense.

Now those definitions of unit tests and integration tests are, to me, wildly inappropriate. But for the purpose of making sense of the benefits of the pyramid, they are dead on:

Unit = code-only tests. Possibly testing multiple genuine classes interacting. Only testing non-trivial code (no getters, setters, or otherwise trivial functions).
Integration = System-level integrations: External dependencies and running services needed by the app.
E2E = UI-driven testing. Click, type, and get-text your way to glory.

My internal model of what those should be is very different.

Unit = One class (or one file, or one function). Test the contents in absolute isolation. To verify and document correct interface behavior.
Integration = ALL THE OTHER CODE-BASED TESTS. If you don't mock all the things you now have an integration test.
E2E = All the tests that require actual running software to be able to run.

With such wildly different schemas, the different views are not surprising. One man's Golden Pyramid is another's Triangular Pile of Refuse.
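To make the difference concrete, here is a minimal sketch that classifies the same small suite under both sets of definitions. Everything in it is an assumption for illustration: the `Traits` fields, the test names, and in particular my reading that a test needing external services counts as "requiring running software" under the stricter model.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Traits:
    """Hypothetical description of one automated test."""
    name: str
    real_classes: int              # genuine (non-mocked) classes exercised
    needs_external_services: bool  # databases, queues, running services
    drives_ui: bool                # clicks and types through the UI


def pyramid_level(t):
    """Classify using the pyramid-friendly definitions."""
    if t.drives_ui:
        return "e2e"
    if t.needs_external_services:
        return "integration"
    return "unit"  # code-only, even with several real classes


def strict_level(t):
    """Classify using the stricter, isolation-based definitions."""
    if t.drives_ui or t.needs_external_services:
        return "e2e"  # needs actual running software
    if t.real_classes > 1:
        return "integration"  # not everything is mocked
    return "unit"


def shape(suite, classify):
    """Count the tests per level, bottom of the pyramid first."""
    counts = Counter(classify(t) for t in suite)
    return {level: counts[level] for level in ("unit", "integration", "e2e")}


# Made-up suite, not taken from a real project.
SUITE = [
    Traits("parse_invoice_line", 1, False, False),
    Traits("totals_with_tax_rules", 3, False, False),
    Traits("discounts_with_price_list", 2, False, False),
    Traits("save_invoice_to_database", 2, True, False),
    Traits("create_invoice_in_browser", 5, True, True),
]

print(shape(SUITE, pyramid_level))
print(shape(SUITE, strict_level))
```

The first call reports three unit tests, one integration test, and one e2e test, which is a pyramid. The second reports one, two, and two for the very same tests, which is not.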

Words matter. I've had many heated discussions with people about the different types of tests. Discussions where the bottom line was that they had different definitions. Different from mine, and/or different from their own sources. Before we adopt some amazing new practice we'd do well to make sure we're talking about the same things. The same thing goes for condemning our teammates for idiocy. 😊 Although... at QPR Software we're allowed to be a bit idiotic, every now and then. We deeply value learning and learning always starts from ignorance.

It's worth noting that I'm considering API tests to be a kind of e2e test (for servers, the UI is just the API). And that I'm intentionally not considering many other kinds of tests, such as database tests, 3rd party integration tests, performance tests, or load tests. They really don't fit into the pyramid; we consider them in totally different terms.

Finally

The right way to apply the test pyramid is to understand what the actual levels of tests are. What are the needs of your project for unit tests, integration tests, and e2e tests? Not all projects lend themselves well to unit tests. Not all projects can support traditional UI-based e2e tests. Projects have different needs with regard to quality and speed of execution. They have different resourcing.

Start by defining the actual words. What's a unit test in your context? What's an integration test or an e2e test?
In what ways do we have to group these tests together? When and how will we execute them?
Throw away the pyramid and find the right shape for your project.
Find the heuristics that fit your project, and figure out how to apply them.
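One way to keep those answers from drifting is to write them down next to the code. The sketch below is only an illustration under stated assumptions: the `Group` type, the three definitions, the trigger names, and the time budgets are all hypothetical, and a real project should substitute its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Group:
    """One kind of test, as this (hypothetical) team has agreed to define it."""
    name: str
    definition: str      # what the word means in this project
    triggers: tuple      # when this group is executed
    budget_seconds: int  # how long the whole group is allowed to take


# Example strategy only; the words and numbers are placeholders.
STRATEGY = (
    Group("unit", "one class in isolation, everything else mocked",
          ("commit", "pull_request", "nightly"), 60),
    Group("integration", "several real classes, no running services",
          ("pull_request", "nightly"), 600),
    Group("e2e", "needs the deployed application",
          ("nightly",), 3600),
)


def groups_for(trigger):
    """Names of the groups that run on the given trigger."""
    return [g.name for g in STRATEGY if trigger in g.triggers]


def time_budget(trigger):
    """Total time allowed for everything that runs on the given trigger."""
    return sum(g.budget_seconds for g in STRATEGY if trigger in g.triggers)


for trigger in ("commit", "pull_request", "nightly"):
    print(trigger, groups_for(trigger), time_budget(trigger))
```

The shape that falls out of a table like this does not have to be a pyramid. What matters is that the definitions, the grouping, and the execution schedule are explicit and shared.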

What is it called when we do that? Test strategy and test planning, with a pinch of test design. Who would have thought those three activities were too complicated to be reasonably depicted by a three-layer pyramid?

Find out more about how to adapt the pyramid in this blog post

Written by

Guðni Ólafsson

Guðni is a mathematician and senior test engineer with 15 years of experience automating tests and associated infrastructure. He's passionate about learning and improving all the things. Yes. All of them.
