Understanding what statistical significance is and why it matters is key for getting the most out of your CRO efforts—but diving into the deep end of statistical analysis can quickly become overwhelming.
To help you better understand the role of statistical significance, probability, and how it all relates to your marketing tactics, we’ve put together this article.
You’ll learn:
- What statistical significance is
- Why it matters for your A/B tests
- How to calculate statistical significance
- Examples and ideas to help you get started
What is statistical significance?
In simple terms, statistical significance means that a test’s results are unlikely to be the result of chance or random occurrence.
Basically, your testing efforts have found a genuine, observable result, even if it isn't the result you were hoping for. If we wanted to get more technical, statistical significance means the odds of seeing results like yours through pure chance (that is, if your test's null hypothesis were true) are very low.
“Hang on, though, what’s a null hypothesis?”
A null hypothesis is a sort of counter-hypothesis designed to keep your tests as objective as possible. It doesn't require as much careful thought as your test's hypothesis: it simply assumes that your changes have no real effect until your data proves otherwise.
In other words, a null hypothesis says that a test’s results are caused by pure, random chance.
Still with us?
We get it—that’s a ton of new terminology and jargon to wrap your head around. If you’re still scratching your head about it, it might help to contextualize statistical significance and its related terms in a familiar CRO setting.
Here’s a simple example:
Imagine you’re running an A/B test with two landing page variants.
Your hypothesis for your test is this:
“By changing the headline on my hero variant, I predict we’ll see a 5% increase in the page’s conversion rate, compared to the control variant.”
For this example, your null hypothesis could be something as simple as this:
“The changes made to the hero variant won’t have a measurable effect on the conversion rate, so any results we see can be attributed to random chance.”
As ridiculous as it might seem to say out loud or even put into words, a null hypothesis is nonetheless a necessary guardrail when you’re trying to determine statistical significance—even if you don’t believe what you’re saying.
What is p-value in statistical significance?
Statistical significance and null hypothesis aren’t the only important terms you need to know. If you’ve skimmed anything about these topics, then you’ve probably also seen some mention of “p-value.”
P-value is simply a way of expressing statistical significance as a number. (You'll sometimes see it flipped around and expressed as a confidence level instead: a p-value of 5% corresponds to 95% confidence.)
Formally, the p-value is a statistical measurement used to weigh a hypothesis against observed data. It tells you how likely it would be to see results at least as extreme as yours if random chance were the only thing at work. Generally:
- The higher a p-value, the more likely it is that your results are due to chance.
- The lower a p-value, the more confident you can be that your results are statistically significant.
While the exact figure may vary, it’s recommended that most tests aim for a p-value of 5% or less to increase confidence that your data is reliable.
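If it helps to see the idea in numbers, here's a tiny, hypothetical sketch in Python (the coin-flip numbers are made up purely for illustration, and it assumes you have SciPy installed):

```python
# A toy illustration of what a p-value means, using SciPy.
# Null hypothesis: the coin is fair (heads comes up 50% of the time).
# Observation: 60 heads in 100 flips.
from scipy.stats import binomtest

result = binomtest(k=60, n=100, p=0.5, alternative="two-sided")
print(f"p-value: {result.pvalue:.3f}")

# If the p-value is below your chosen threshold (commonly 0.05), the result
# is unlikely to be pure chance, so you'd call it statistically significant.
# If it's above, you can't rule chance out yet.
```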
Why is it important for your A/B tests to be statistically significant?
So now we’re getting into the meat and potatoes: Why is it important that your A/B tests be statistically significant?
There are four major reasons why you would want your tests to be statistically significant:
- To validate your hypothesis
- To ensure informed decision-making
- To reduce uncertainty
- To help you uncover further findings
Let’s take a closer look at each of these reasons.
Validating hypotheses
The most apparent reason for ensuring statistical significance is that it helps you validate the hypothesis you’re testing. Whether that hypothesis is correct or not is less important; statistical significance simply means the tests and data you’re gathering are valid and worth analyzing.
Informing decision making
Statistically significant data gives you a strong basis for any decision-making you undertake. Because you know your data is reliable, you can be more confident using it to guide and shape your strategy.
Reducing uncertainty
This one is closely related to your decision-making efforts, but bears repeating here—by determining your data is statistically significant, you’re reducing uncertainty and ambiguity in your testing and CRO processes. Having a strong foundation of good, reliable data is critical, as it gives you precedents you can point to and something you can rely on to explain your reasoning and put any doubts to rest.
Guiding further findings
Finally, statistically significant data is vital for helping you figure out next steps. This is a foundational part of your CRO efforts—we’ve written before about how even those A/B tests that don’t confirm your hypothesis shouldn’t be considered failures.
As long as you’re still gathering statistically significant data, you can learn from it and use it to formulate new hypotheses and discover new information.
How to calculate statistical significance
Now that we’ve covered all the background details and the reasons why statistical significance is important, let’s dig into how you can go about calculating statistical significance in your next CRO experiment’s findings.
Calculating statistical significance can be an extensive process if you’re just running the numbers by hand—thankfully, tools like Unbounce’s A/B testing tool help automate these processes so you’re not crunching the numbers yourself. But knowing how to make the calculations yourself and understanding exactly what you’re calculating can help you develop a better understanding to inform further testing.
Generally speaking, calculating statistical significance involves:
- Create a hypothesis
- Determine the significance level
- Calculate ideal sample size
- Collect data
- Analyze data
- Interpret results
1. Create a hypothesis
No matter what test you're running, you're going to need a hypothesis and a null hypothesis. Your hypothesis is just the sort you'd develop for any CRO experiment you run, and you probably already have some idea of what you're looking to accomplish.
To get a bit more technical, your hypothesis should specifically be a clear statement that predicts a potential outcome following a change to a single variable, like we’ve written about before.
And remember, for A/B testing, you’re only changing one variable at a time.
Alongside your standard hypothesis, you’ll need a null hypothesis to help with your calculations. Like we mentioned earlier, your null hypothesis simply says there’s no significant difference between groups you’re comparing.
2. Determine a significance level
Next, you’ve got to determine the significance level. The significance level is a sort of threshold for the strength of evidence in your A/B test that must be gathered before you can reject your null hypothesis.
Basically, it's the maximum amount of risk you're willing to accept that your results are due to chance, and it's the number you'll compare your test's p-value against. The significance level is sometimes referred to as the alpha.
In most cases, you'll set a significance level between 0.01 and 0.05, or 1-5%. This is a pretty common range for most tests. Remember, the lower the significance level, the stronger your evidence needs to be.
3. Calculate ideal sample size
As part of your efforts to determine statistical significance, you need to ensure you have the right sample size.
This can be a tricky process. Your best bet for figuring out the ideal sample size for your test is using Unbounce’s A/B Test Duration and Sample Size calculator—this will let you quickly punch in your goals and get useful target numbers to determine how long your test should run and how large your audience should be.
There are a number of variables involved in calculating sample size, so using a calculator is the easiest way to streamline what can otherwise be a convoluted process.
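If you're curious what a calculator like that is doing under the hood, here's a rough sketch of one common approach: the standard normal-approximation formula for comparing two proportions. It's not necessarily the exact formula Unbounce's calculator uses, and the baseline rate, expected lift, significance level, and power below are placeholder values.

```python
# Rough sample size per variant for an A/B test comparing two conversion
# rates, using the standard two-proportion approximation.
import math
from scipy.stats import norm

def sample_size_per_variant(p1, p2, alpha=0.05, power=0.80):
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value for a two-sided test
    z_beta = norm.ppf(power)            # critical value for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)

# Example: a 10% baseline conversion rate, hoping to detect a lift to 12%
print(sample_size_per_variant(0.10, 0.12))  # visitors needed per variant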
This is also a great opportunity to determine standard deviation (how spread out your data is around the mean) and standard error (an estimate of how much a sample statistic, like a group's mean, would vary from sample to sample).
Here’s the formula for calculating standard deviation:
Standard deviation = √( Σ(x − μ)² / (N − 1) )
- √ indicates we're taking the square root
- Σ indicates we're summing over every data point
- x represents an individual data point
- μ represents the mean of the data
- N represents the total number of data points
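If that's easier to read as code, here's a minimal sketch of the same standard deviation formula as a small Python function (the sample data is made up for illustration):

```python
import math

def standard_deviation(data):
    """Sample standard deviation: sqrt of the sum of squared deviations / (N - 1)."""
    n = len(data)
    mean = sum(data) / n
    squared_deviations = sum((x - mean) ** 2 for x in data)
    return math.sqrt(squared_deviations / (n - 1))

# Example with made-up daily conversion counts
print(standard_deviation([12, 15, 9, 14, 11]))
```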
And the formula for calculating the standard error of the difference between your two groups:
Standard error = √( (s1² / N1) + (s2² / N2) )
In this equation, the variables are as follows:
- √ indicates we’re taking the square root
- s1 represents the standard deviation of your first group
- N1 represents group one’s sample size
- s2 represents the standard deviation of your second group
- N2 represents group two’s sample size
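And here's that standard error calculation as code, reusing the kind of standard deviation values produced by the function above. The group values plugged in at the bottom are placeholders, not real test data:

```python
import math

def standard_error(s1, n1, s2, n2):
    """Standard error of the difference between two group means."""
    return math.sqrt((s1 ** 2 / n1) + (s2 ** 2 / n2))

# s1, s2 are each group's standard deviation; n1, n2 are each group's sample size.
print(standard_error(s1=3.2, n1=1000, s2=3.5, n2=1000))
```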
4. Collect data
Next, you run your A/B test and gather your data. We’ve written a ton on the ins and outs of A/B testing elsewhere, so we won’t go into too much detail here. Just make sure any analytics tools you’re running are set up properly and that you’re sending your audience to the appropriate variants.
Make sure you’re collecting data from a random sample to avoid any bias that could influence your test’s results. Let the test run and get your data in order.
5. Analyze data
This is probably the most complex part of calculating statistical significance, and it’s going to be pretty heavy on the math.
There are a few ways to run a statistical analysis of your data, including:
Chi-square tests
A chi-square test is a statistical test used to determine whether there's a significant association between two categorical variables, like which variant a visitor saw and whether they converted. It helps you see if the results you observe are different from what you'd expect by chance, as in your null hypothesis.
Chi-square tests are one of the most reliable ways to analyze A/B testing data, because A/B test results are naturally categorical: each visitor either converted or didn't, which keeps the analysis focused and the conclusions easier to trust.
Z-tests
Z-tests are used to determine if there's a significant difference between the means of two groups when the sample size is large and the population variance is known. They're often used for comparing proportions or means.
T-tests
T-tests are similar to Z-tests, but are used when you're dealing with a small sample size or an unknown population variance. These tests compare the means of two different groups to see if there are any significant differences.
ANOVA tests
ANOVA tests are used to compare the means of three or more groups to figure out if at least one group's mean is notably different from the others. These tests are helpful if you're comparing multiple groups at the same time, like in a multivariate test or A/B/n test (where the "n" just means more than two variants).
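To make this a bit more concrete, here's a minimal sketch of a chi-square test on A/B-style results using SciPy. The conversion counts are placeholders, and this isn't necessarily the exact test your analytics tool runs under the hood:

```python
from scipy.stats import chi2_contingency

# Rows are variants, columns are [converted, did not convert].
# These counts are placeholders for illustration only.
observed = [
    [80, 920],    # Variant A
    [105, 895],   # Variant B
]

chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi-square statistic: {chi2:.2f}, p-value: {p_value:.3f}")
```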
6. Interpret results
Finally, you can interpret your data. This is where you put your findings under the microscope and figure out what they're telling you.
First, you'll need to calculate the p-value for your results. This is most often done with statistical software or by referring to probability tables, and it's important to remember that how you calculate your p-value will depend on the type of test you ran in the first place.
Once you have your p-value, compare it to the significance level you established earlier.
Remember how we mentioned that, for most tests, you'd set your significance level between 0.01 and 0.05? If your p-value comes in below the level you chose, congratulations! Your test results are statistically significant, and you can start using them to decide what actions to take on your landing pages.
But if your p-value is above the threshold you set, you can't rule out random chance as the explanation for your results, which means you may need to run the test again, gather more data, or revise your testing parameters.
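In code, that final comparison is as simple as it sounds. Here's a hypothetical sketch, where both values are placeholders:

```python
# alpha is the significance level you chose up front;
# p_value comes from whichever test you ran in step 5.
alpha = 0.05
p_value = 0.03  # placeholder value for illustration

if p_value <= alpha:
    print("Statistically significant: reject the null hypothesis.")
else:
    print("Not significant: you can't rule out random chance yet.")
```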
Statistical significance example
So we’ve covered a ton of information here, and statistics can be a bit daunting to try and tackle on your own. Here’s a simple example to put things in perspective.
Say you’re testing two versions of a CTA button on your landing page. You’ve settled on “buy now” as your button text but you want to see what design elements will help you drive conversions.
You make one variant page with a red button design (variant A) and one with a green button design (variant B). You A/B test your elements and get the following results:
- Variant A had 10,000 visitors, and 1,000 of them converted, for a 10% conversion rate.
- Variant B also had 10,000 visitors, but 1,200 of them converted, for a 12% conversion rate.
At first glance, you might think variant B is the clear winner and that you should be making similar updates across ALL your landing pages, right?
But those results might not be statistically significant, so you start analyzing the data more closely.
You start by calculating the difference in conversion rates: easy, 2%. Next, you need to compare that difference to the amount of variation you'd expect to see purely by chance between two groups of this size. You run your statistical test and determine that the probability of a 2% difference arising by chance alone is very low, coming in well under 5%.
Because this p-value is below the significance level for your test, you’re confident that the results you achieved aren’t simply a stroke of good luck and that the green button on variant B is the cause of the higher conversion rate.
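If you want to check numbers like these yourself, here's a hedged sketch of a two-proportion z-test, one common way to compare conversion rates (not necessarily the exact test your testing tool uses):

```python
import math
from scipy.stats import norm

def two_proportion_z_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Two-sided z-test for the difference between two conversion rates."""
    p_a = conversions_a / visitors_a
    p_b = conversions_b / visitors_b
    # Pooled conversion rate under the null hypothesis (no real difference)
    p_pool = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / visitors_a + 1 / visitors_b))
    z = (p_b - p_a) / se
    return 2 * norm.sf(abs(z))  # two-tailed p-value

# Numbers from the example above
print(two_proportion_z_test(1_000, 10_000, 1_200, 10_000))
```

Run it with the example numbers and the p-value comes out far below 0.05, which is exactly what lets you call variant B the winner with confidence.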