Holdout Tests: 4 Common Pitfalls To Look Out For


Terence Einhorn, Sr. Solutions Consultant in Sales

Published 02/20/2024

Holdout tests are an essential tool for uncovering the true incremental impact of your media, so you can understand how your marketing actually performs and optimize your ad spend accordingly.

Some of our past case studies have highlighted vendor platforms that take credit for more sales than they actually drive. Take a look at this excerpt from a case study featuring Johnny Was, for instance:

“Measured ran a holdout test, which withholds media from a group of strategically chosen test markets to determine the impact on sales. The first discovery was Google’s significant over-reporting: for every four conversions that Google took credit for, Measured found one. Google was taking credit for 4x more conversions than it actually caused.”

At Measured, we’re all about revealing the causal impact of your marketing efforts. While your marketing team may already be running holdout tests (we really hope they are!), it’s also very possible that they’re running into certain challenges while doing so. 

Here’s a list of some of the common pitfalls you may encounter when you’re running holdout tests.

Common Pitfalls in Holdout Testing

There are a few common pitfalls that often plague brands new to holdout testing.

Random Market Selection

The most rudimentary geo-market selection technique is to break a country into random cohorts of roughly equal size (50/50, 33/33/33, etc.). 

This creates two problems. First, it exposes the business to unnecessary disruption by holding out such a large portion of the country; statistical or simulation-based selection techniques usually yield test cells of around 10% of sales volume.

Second, it tends to yield very volatile results, as there is no guarantee that two equal-sized cohorts of markets are representative of each other.

In short, more robust data science or simulation techniques ensure that test markets are statistically representative of the broader region while comprising a minimal amount of volume.
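To make the idea concrete, here is a minimal Python sketch of simulation-based market selection. Treat it as an illustration under stated assumptions, not as Measured's actual methodology: the sales data is synthetic, the function names (make_synthetic_sales, select_test_markets) and all parameters are hypothetical, and "representative" is approximated simply as the correlation between the test cell's weekly sales and the weekly sales of every other market.

```python
import math
import random
import statistics


def make_synthetic_sales(n_markets=20, n_weeks=52, seed=1):
    """Hypothetical data: markets of different sizes sharing one seasonal pattern."""
    rng = random.Random(seed)
    sales = {}
    for i in range(1, n_markets + 1):
        size = 100.0 * i
        sales[f"market_{i:02d}"] = [
            size * (1 + 0.2 * math.sin(2 * math.pi * w / 26)) + rng.gauss(0, 0.1 * size)
            for w in range(n_weeks)
        ]
    return sales


def pearson(xs, ys):
    """Pearson correlation between two equal-length series."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def total_series(sales, markets):
    """Weekly sales summed across the given markets."""
    n_weeks = len(next(iter(sales.values())))
    return [sum(sales[m][w] for m in markets) for w in range(n_weeks)]


def select_test_markets(sales, target_share=0.10, tolerance=0.03,
                        cell_size=3, n_draws=2000, seed=0):
    """Draw many candidate test cells; keep the one near the target share
    of volume whose weekly sales track the remaining markets most closely."""
    rng = random.Random(seed)
    names = sorted(sales)
    grand_total = sum(sum(series) for series in sales.values())
    best = None
    for _ in range(n_draws):
        cell = rng.sample(names, cell_size)
        share = sum(sum(sales[m]) for m in cell) / grand_total
        if abs(share - target_share) > tolerance:
            continue  # cell is too large or too small
        rest = [m for m in names if m not in cell]
        r = pearson(total_series(sales, cell), total_series(sales, rest))
        if best is None or r > best[0]:
            best = (r, share, sorted(cell))
    return best  # (correlation, share of volume, market names), or None


if __name__ == "__main__":
    print(select_test_markets(make_synthetic_sales()))
```

Unlike a random 50/50 split, the sketch scores each candidate cell on two things at once: how little volume it holds out and how closely it moves with the rest of the country. A production approach would likely use richer criteria than a single correlation.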

Overly Precise Geographies

There is a temptation to run market selection at the most precise level possible (zip code level, etc.), in the belief that this will yield more precise results and allow brands to test in a smaller portion of the country.

However, the opposite is true in both cases: Ad platform targeting capabilities are less accurate at more granular levels, so there tends to be more contamination at the zip code level than at the state level, for example.

This increased contamination means you would generally need to test a higher portion of the country to yield reliable results.
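A rough back-of-the-envelope calculation shows why. Assume a fraction of the holdout audience still sees ads, so the measurable effect shrinks by roughly (1 - contamination). Under a standard power calculation, required test volume scales with 1 / effect squared. The contamination rates below are illustrative assumptions, not measured values, and sample_inflation is a hypothetical helper.

```python
def sample_inflation(contamination):
    """Factor by which required test volume grows when the measurable
    effect is attenuated by (1 - contamination)."""
    if not 0 <= contamination < 1:
        raise ValueError("contamination must be in [0, 1)")
    return 1.0 / (1.0 - contamination) ** 2


# Assumed contamination rates, for illustration only.
for label, rate in [("state level (assumed 5%)", 0.05),
                    ("zip code level (assumed 30%)", 0.30)]:
    print(f"{label}: about {sample_inflation(rate):.2f}x the volume")
```

With these assumed rates, the state-level test needs about 1.11x the volume of a perfectly clean test, while the zip-code-level test needs about 2.04x.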

Being Overly Granular

Similarly, there is a temptation to test very granular levels of media (campaign level, ad set level, etc.) to glean higher-quality insights.

The truth is that a given media tactic needs to contribute a significant portion (generally more than 1% or 2%) to the OVERALL business to yield a significant test result. It’s very unlikely that a single campaign or ad within a channel is driving that amount of business, so testing at that level will likely yield inconclusive results.
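The arithmetic behind this can be sketched in a few lines of Python. The sketch assumes that the test-versus-control comparison carries some relative noise (here a hypothetical 1% of sales) and that a result counts as significant when the expected drop in sales is at least 1.96 times that noise. Both the noise level and the detectability function are assumptions for illustration.

```python
def detectability(contribution_share, relative_noise, z_threshold=1.96):
    """Return (z-score, detectable?) for a tactic that drives
    `contribution_share` of total sales, given the relative noise
    in the test-vs-control comparison."""
    if relative_noise <= 0:
        raise ValueError("relative_noise must be positive")
    z = contribution_share / relative_noise
    return z, z >= z_threshold


# Assumed shares of total business, for illustration only.
for label, share in [("whole channel (5% of sales)", 0.05),
                     ("single campaign (0.3% of sales)", 0.003)]:
    z, ok = detectability(share, relative_noise=0.01)
    print(f"{label}: z = {z:.1f}, detectable = {ok}")
```

Under these assumptions the whole channel clears the threshold comfortably, while the single campaign's effect is smaller than the noise around it.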

Knee-Jerk Reactions

The most common pitfall is to obsessively recalculate test results “as they come in” and then react in knee-jerk fashion (ending the test prematurely, adding new market exclusions during the test, etc.) when the expected results don’t appear.

Test durations are selected for a reason: It is impossible to deliver reliable or usable results on just a few days' worth of test data. 
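A small simulation illustrates the risk of checking early and often. The Python sketch below runs many hypothetical tests in which the media has no effect at all, then compares how often a "significant" result shows up if you check every day versus only on the final day. The 28-day duration, the normally distributed daily noise, and the simulate_peeking function are all assumptions chosen for illustration.

```python
import random


def simulate_peeking(n_sims=2000, n_days=28, z_crit=1.96, seed=0):
    """Simulate tests with NO true effect. Return the share of tests that
    look significant on at least one daily check, and the share that look
    significant on the final day only."""
    rng = random.Random(seed)
    flagged_on_any_day = 0
    flagged_at_end = 0
    for _ in range(n_sims):
        running_sum = 0.0
        crossed = False
        z = 0.0
        for day in range(1, n_days + 1):
            running_sum += rng.gauss(0.0, 1.0)  # daily test-vs-control noise
            z = running_sum / day ** 0.5        # z-score of data so far
            if abs(z) > z_crit:
                crossed = True
        flagged_on_any_day += crossed
        flagged_at_end += abs(z) > z_crit       # z is now the final-day value
    return flagged_on_any_day / n_sims, flagged_at_end / n_sims


if __name__ == "__main__":
    peek_rate, final_rate = simulate_peeking()
    print(f"false positives when checking daily:   {peek_rate:.1%}")
    print(f"false positives when checking at end:  {final_rate:.1%}")
```

Because the daily checks include the final day, the first rate can never be lower than the second, and in practice it is typically well above the nominal 5%. A brand that stops the first time a result looks conclusive is therefore far more likely to act on noise.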

Other Considerations

While the above are the ‘big’ ones to look out for, you should also keep in mind other factors, some of which may be out of your control, such as competitor actions or changes in the market. 

Holdout tests are a necessary part of your marketing strategy, but finding a measurement partner who can help you navigate these pitfalls is a good way to gain meaningful insights into your media performance.

For a deep dive into everything you want to know about holdout tests, check out our FAQ.