Minimum data is the smallest data set you should use before calculating statistical significance.
If your ad tests don’t have enough data, you shouldn’t pause ads or make adjustments based on that data, since any differences you see are highly likely to be due to chance rather than actual patterns in the data.
For instance, this is a test result after 97 impressions:
Ad | Impressions | Clicks | CTR | Confidence |
Control | 40 | 1 | 2.5% | |
Ad 2 | 33 | 5 | 15.15% | 97.03% |
Ad 3 | 24 | 0 | 0% | 15.57% |
In purely mathematical terms, we have 97% confidence that ad 2 is the winner. If this were a static data environment, where the data that comes later is similar to the data that came before, we might take action. However, search is a dynamic environment, and 97 impressions is clearly not enough data (although any online calculator will tell you it is).
Here’s the exact same test after 3163 impressions:
Ad | Impressions | Clicks | CTR | Confidence |
Control | 1023 | 23 | 2.25% | |
Ad 2 | 993 | 29 | 2.92% | 82.9% |
Ad 3 | 1147 | 56 | 4.88% | 99.96% |
In this case, each of our ads has about 1,000 impressions, and we’re 99.96% confident in our winner (a different winner than at 97 impressions). At this point, we can be confident taking action on a CTR-based test.
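The article does not say how the confidence column was calculated. As a minimal sketch, a one-sided two-proportion z-test with an unpooled standard error appears to reproduce the figures in both tables to within rounding. The function name `confidence_vs_control` and the choice of test are assumptions made for illustration, not the author's stated method.

```python
import math


def confidence_vs_control(control_impr, control_clicks, ad_impr, ad_clicks):
    """One-sided confidence that the ad's true CTR beats the control's."""
    p_control = control_clicks / control_impr
    p_ad = ad_clicks / ad_impr
    # Unpooled standard error of the difference between the two CTRs.
    se = math.sqrt(p_control * (1 - p_control) / control_impr
                   + p_ad * (1 - p_ad) / ad_impr)
    if se == 0:
        # Assumption: with no measurable spread, treat the test as undecided.
        return 0.5
    z = (p_ad - p_control) / se
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# (impressions, clicks) taken from the two tables above.
early = {"Ad 2": (33, 5), "Ad 3": (24, 0)}
later = {"Ad 2": (993, 29), "Ad 3": (1147, 56)}

for name, (impr, clicks) in early.items():
    print("97 impressions,", name, f"{confidence_vs_control(40, 1, impr, clicks):.2%}")
for name, (impr, clicks) in later.items():
    print("3163 impressions,", name, f"{confidence_vs_control(1023, 23, impr, clicks):.2%}")
```

The calculation is equally happy with 97 impressions or 3163, which is the point: the formula has no notion of whether the sample is large enough to trust.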
At low data levels, what you really want to avoid is having just one or two people significantly affect your data. For instance, if you have 100 impressions and 1 click, then you have a 1% CTR. If the next 2 people who see your ad both click it, your CTR goes from 1% to 2.94% (3 clicks on 102 impressions), which is a huge change and can completely change which ad you would have chosen as the winner or loser.
As the data grows, you want a sample size large enough that a small percentage of searchers can’t significantly affect your data. This is why the more impressions an ad test generates within a given timeframe, the larger the sample size you should require.
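To make the sample-size effect concrete, here is a small sketch comparing how two extra clicks move a 1% CTR at two different volumes. It assumes each extra click also adds one impression; the helper name `ctr_after_extra_clicks` is hypothetical.

```python
def ctr_after_extra_clicks(impressions, clicks, extra_clicks):
    """CTR once `extra_clicks` more searchers each see and click the ad."""
    return (clicks + extra_clicks) / (impressions + extra_clicks)


# Both ads start at a 1% CTR, at very different data volumes.
for impressions, clicks in [(100, 1), (10_000, 100)]:
    before = clicks / impressions
    after = ctr_after_extra_clicks(impressions, clicks, 2)
    print(f"{impressions:>6} impressions: {before:.2%} -> {after:.2%}")
```

At 100 impressions the two clicks nearly triple the CTR, while at 10,000 impressions they barely register.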
One part of the minimum data consideration that does not exist in purely mathematical analysis is how behavior varies over time.
For instance, imagine these three scenarios:
- You just got to work on a Monday morning and start to search for work related items
- It’s lunchtime on the last Monday of the month and your rent is due soon and you want to figure out finances
- You’re relaxing after dinner on a Monday and you remember something about your day and you want to search more about that item
That’s just one day, yet your Monday evening search probably happened on a mobile phone, and your conversion was likely to be sending yourself a reminder to examine the result on Tuesday morning. Your Monday morning conversion was more likely to be a whitepaper download or a phone call.
Behavior also changes across the week. If you were searching for vacation cruises, your Monday search might be about how much you want to escape the office, while that same search on a Saturday afternoon might be planning, with your spouse, a cruise vacation you intend to buy.
As timeframes change, so does search behavior – this is why we need to take into account not just the data, but the timeframe of the data. You should always use a minimum of a week of data. However, it is fine to use a month or even three months of data gathering before you take action.
When determining minimum data, there are two considerations:
- What is your testing metric?
- How much data do you generate each month?
What is Your Testing Metric
You need to determine your testing metric before you know which data points to define.
As an example, if you are testing by CTR, your conversions don’t matter since CTR doesn’t use conversion data in its calculation.
Most metrics are calculated from both a required data point (the opportunity) and a secondary data point (the action).
For instance, click-through rate is the ratio of clicks to impressions. You must have an impression to get a click. Thus impressions are mandatory but clicks are optional.
In some cases, you might not want to define the optional data point. For instance, let’s say we’re running a test of two ads with this data set:
- Ad 1: Impressions 1000, clicks 100
- Ad 2: Impressions 1000, clicks 10
In this test, we’re confident that ad 1 is the better ad, and the result has reached over 90% confidence. However, if we defined a minimum of 25 clicks, we’d still be waiting for results since ad 2 hasn’t hit that number yet. When you define the optional data points, you might wait longer to get results if one of your ads is significantly below average (in this case a 10% vs 1% CTR).
With minimum data, every ad in the test should hit the minimum before you look at the information, not just the test as a whole. As there are many ad rotation options, which we will cover in the future, it is common that not all ads within a test have the same opportunity, and thus each ad should meet the minimum requirements before you examine your confidence levels.
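The per-ad rule can be sketched as a simple gate that runs before any confidence calculation. The `AdStats` class and `ads_below_minimum` function are hypothetical names used for illustration. The data is the two-ad example above, checked first with only an impression minimum and then with the optional 25-click minimum added.

```python
from dataclasses import dataclass


@dataclass
class AdStats:
    name: str
    impressions: int
    clicks: int = 0
    conversions: int = 0


def ads_below_minimum(ads, minimums):
    """Return the names of ads that miss any threshold in `minimums`.

    `minimums` maps a field name ("impressions", "clicks", "conversions")
    to a required count; leave a field out to leave it undefined.
    """
    return [
        ad.name
        for ad in ads
        if any(getattr(ad, field) < required for field, required in minimums.items())
    ]


test_ads = [AdStats("Ad 1", 1000, 100), AdStats("Ad 2", 1000, 10)]

# Only the required data point is defined: every ad qualifies.
print(ads_below_minimum(test_ads, {"impressions": 1000}))
# The optional click minimum is also defined: Ad 2 holds the test back.
print(ads_below_minimum(test_ads, {"impressions": 1000, "clicks": 25}))
```

An empty list means every ad has met its minimums and the confidence levels are worth examining.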
As timeframe is highly important to any test, every metric should use a minimum timeframe of at least a week, although using monthly data works just as well.
Here’s the minimum data that you should define by testing metric:
Metric | Impressions | Clicks | Conversions | Timeframe |
CTR | Yes | Optional | Yes | |
CPA | Yes | Yes | ||
Conversion rate | Optional | Yes | Yes | |
CPI | Yes | Optional | Yes | |
ROAS | Yes | Yes | ||
RPI | Yes | Yes |
How Much Data Do You Generate Each Month
We’re often asked to suggest minimum data amounts. There are times I’m hesitant to give out numbers because not everyone should be using the same numbers.
If you have a brand term that is searched 1 million times a week, you should be using at least a million impressions as your minimum. Many brands aren’t searched 1 million times in a year, and should be happy with 10,000 to 100,000 impressions before they examine their confidence levels.
These are MINIMUM DATA recommendations. It is OK to use higher numbers than these.
Minimum Data Recommendations for Most Companies:
Traffic level | Impressions | Clicks | Conversions |
Low Traffic | 350 | 300 | 7 |
Mid Traffic | 750 | 500 | 13 |
High Traffic | 1000 | 1000 | 20 |
Well-known brand terms | 100,000 | 10,000 | 100 – 1000 |
As your campaigns are often segmented by brand, product terms, long tail, etc., the ads within each campaign can generally use the same minimum data. You will often use different metrics, minimum data, and statistical significance factors for different parts of your account.
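One way to apply the recommendations table is to keep it as a lookup and combine it with the one-week minimum timeframe. This is a sketch under assumptions: the tier keys, `MIN_DAYS`, and `meets_minimum` are hypothetical names, the brand tier uses the lower bound of the 100 – 1000 conversion range, and only the data points you pass in are checked, so you can leave out whatever your metric does not use.

```python
# Per-ad minimums copied from the recommendations table above.
MINIMUMS_BY_TIER = {
    "low": {"impressions": 350, "clicks": 300, "conversions": 7},
    "mid": {"impressions": 750, "clicks": 500, "conversions": 13},
    "high": {"impressions": 1000, "clicks": 1000, "conversions": 20},
    # Assumption: use the lower bound of the 100 - 1000 conversion range.
    "brand": {"impressions": 100_000, "clicks": 10_000, "conversions": 100},
}
MIN_DAYS = 7  # minimum timeframe of one week


def meets_minimum(tier, days_of_data, **counts):
    """Check one ad against its tier; only the counts passed in are checked."""
    if days_of_data < MIN_DAYS:
        return False
    required = MINIMUMS_BY_TIER[tier]
    return all(value >= required[field] for field, value in counts.items())


# A CTR test in a mid-traffic campaign: only impressions are required.
print(meets_minimum("mid", 10, impressions=800))
# Same ad, but only five days of data: the timeframe minimum fails.
print(meets_minimum("mid", 5, impressions=800))
# A conversion-based test in a high-traffic campaign, short on conversions.
print(meets_minimum("high", 30, clicks=1200, conversions=15))
```

Since different parts of an account often use different minimums, a lookup keyed by campaign type keeps those choices in one place.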