The Deceptively Simple A/B Testing Mistake Quietly Killing Your Conversion Rates

By Vincent Barr, May 20th, 2014 in A/B Testing | 6 comments
Marketing Baloney
Face it – the conversion marketing world is full of baloney. Here’s the beef.

Physicist Leonard Susskind closed his TED talk about fellow physicist Richard Feynman with advice on how we could really honor his late friend, who he said wouldn’t have enjoyed such an event: 

“[Get] as much baloney out of our own sandwiches as we can”

And let’s face it. The conversion marketing world is full of baloney. It’s easy to have a dozen A/B tests up and running in no time and believe you’re being a good, data-driven marketer.

But what if you’re holding the map upside down? Worse, what if that conversion lift you’re so chuffed about is actually working against you?

In this post I’m going to tell you about a common cognitive bias that regularly enables us to make poor assumptions with confidence and that can lead to disastrous results without us even knowing it.

I’m also going to tell you how to overcome the effects of these assumptions to increase the conversions that count and generate better quality leads.

If this sounds appealing to you, then read on.

Take the “Ass of U and Me” test

Imagine that you are a consultant to a small ecommerce business called Ropeburn (yeah, you’re a Janet fan).

Ropeburn sells custom nutrition and exercise programs to personal trainers. Its website receives 20,000 unique visitors per month, offers lots of free content and a few exclusive tips gated behind your landing page lead gen form. Let’s have a little pop quiz for this hypothetical situation:

You select, at random, a visitor who submitted a form. You determine that the visitor’s name is Joe. Joe exercises 1-3 times per week, has tried dieting in the past and has been working out for 3+ years.

What are the odds that Joe is a personal trainer?

a. Well below 50%
b. About 50%
c. Well above 50%

The answer to the question above is A. Well below 50%.

If that’s not the answer you got, don’t fret. Most people will overestimate the probability that Joe is a personal trainer.

But why?

Because we tend to ignore base rates. While this example simplifies the concept and translates it to web traffic, the principle stands: We focus on the specific information we’re given and lose sight of the bigger picture.

The big picture in this case is the proportion of people who are actually personal trainers.

There are 267,000 personal trainers in the United States, which sounds like a lot until you compare it to the 49,933,000 gym memberships (discounting personal trainers) in circulation.

For the sake of the example, if we assume that the only two populations inhabiting the planet are gym-goers and trainers (and that they both spend equal amounts of time online), for every one trainer browsing the web, there are 187 gym-goers browsing the web.

The odds of a random visitor being a trainer? Low. The same rule generally applies to your target audience.
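To make the base-rate arithmetic concrete, here’s a quick sketch in Python using the numbers above. Note that the 10x likelihood ratio for Joe’s profile is a made-up figure for illustration, not a real statistic:

```python
# Base-rate sketch using the article's numbers (equal online time assumed)
trainers = 267_000                 # US personal trainers
gym_goers = 49_933_000             # US gym memberships, excluding trainers

gym_goers_per_trainer = gym_goers / trainers
prior_p_trainer = trainers / (trainers + gym_goers)
print(round(gym_goers_per_trainer))   # 187 gym-goers per trainer
print(f"{prior_p_trainer:.2%}")       # ~0.53% prior that a visitor is a trainer

# Even if Joe's profile (works out 1-3x/week, 3+ years) were, say, 10x more
# common among trainers than gym-goers (a made-up likelihood ratio), the
# posterior probability stays low:
likelihood_ratio = 10
posterior_odds = likelihood_ratio * trainers / gym_goers
posterior_p = posterior_odds / (1 + posterior_odds)
print(f"{posterior_p:.1%}")           # ~5.1%, still "well below 50%"
```

Even a profile that strongly suggests “trainer” can’t overcome a 187:1 prior against it.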

Why visitor makeup matters

Which metrics do you consider in your tests?

Most people will look at sample size, conversion/click-through rate and confidence interval. A handful may look at statistical power, length of run and check for errors across various browsers and devices.

What is often overlooked is the number of marketing qualified leads generated (or lead-to-MQL rate).

In other words, you fail to segment different populations in your A/B test. You focus on aggregate conversion rate (CVR) as a go-to metric because it’s easier. We take mental shortcuts all of the time – and this is one of them.

The metric you need to uncover is the conversion rate of your target population and what percentage of visitors your target population accounted for.

Failing to segment your traffic can lead to:

  • Bad testing decisions
  • More unqualified leads
  • A lower conversion rate among your target audience
  • A higher conversion rate among your non-target audience
  • Wasted time and opportunity

Why relying solely on aggregate CVR is problematic

Imagine that you complete a test for Ropeburn with the following parameters:

Test goal: Increase form submissions on your landing page
Assumption: traffic is split 50% trainers, 50% gym-goers
Observation:

  • Baseline CVR: 12%
  • Confidence: 95%
  • Sample: 10,000 control, 10,000 treatment

Observed outcome: Variation A wins with 125% lift in CVR
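A quick two-proportion z-test (sketched below with the hypothetical numbers above, using only the standard library) shows why this result looks so convincing in aggregate, which is exactly the trap:

```python
from math import sqrt, erf

# Hypothetical numbers from the Ropeburn test above
n_c, n_t = 10_000, 10_000       # control / treatment sample sizes
cvr_c = 0.12                    # baseline CVR
cvr_t = 0.12 * 2.25             # a 125% lift -> 27%

# Pooled two-proportion z-test
p_pool = (cvr_c * n_c + cvr_t * n_t) / (n_c + n_t)
se = sqrt(p_pool * (1 - p_pool) * (1 / n_c + 1 / n_t))
z = (cvr_t - cvr_c) / se
p_value = 2 * (1 - 0.5 * (1 + erf(z / sqrt(2))))  # two-sided

print(round(z, 1))  # ~26.8: overwhelmingly "significant"
```

The aggregate result is about as significant as results get, yet it says nothing about which segment produced the lift.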

Generally, you would accept this to be a successful test and implement Variation A.

However, you could be missing a whole host of scenarios, scenarios in which implementing Variation A is actually a bad idea. For example:

  1. Your non-target audience loves Variation A, while your target audience had no response. This scenario is not good; you won’t be losing any valuable leads, but you are going to put more bad leads into your CRM.
  2. Your non-target audience likes Variation A; your target audience doesn’t. Here, you’d be losing customers in addition to gaining more unqualified leads.

And these scenarios don’t even account for the fact that traffic is likely split unevenly across visitor segments.

Your traffic is coming from unequal visitor segments

This may seem obvious, but it’s all too easy to overlook.

As we’ve seen in the assumption test, it can be easy to make poor assumptions when faced with incomplete information. It’s easy to neglect base rates, or in this case, the fact that a random visitor is much more likely to be a gym-goer than a trainer. Traffic is rarely split evenly across segments (e.g. 3:4 trainers to gym-goers), so poor performance in one segment is even easier to overlook.

Equipped with this realization, we now need to understand our baseline CVR by segment.

Consider the ways in which a 12% CVR can be achieved:

A/B testing: aggregate CVR graph
Because traffic is not split evenly across visitor segments, aggregate CVR is not always a good indicator of performance.

As you can see in the image above, your baseline conversion rate is heavily influenced by the larger traffic segment (in this case, gym-goers).

What does this mean for the test?

Your control may convert 35% of visitors who are trainers and 4% of visitors who are gym-goers. Variation B may lead to a 189% lift in CVR wherein barely any trainers are converting while gym-goers are converting at almost 50%.
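Aggregate CVR is just a traffic-weighted average of segment CVRs. The sketch below reconstructs the numbers above; the 25.8% trainer share and the segment CVRs for Variation B are assumptions chosen to reproduce the 12% baseline and the ~189% lift:

```python
# Assumed traffic mix (chosen to reproduce the article's 12% baseline)
trainer_share = 0.258
gym_share = 1 - trainer_share

# Control: converts trainers well, gym-goers poorly
control = trainer_share * 0.35 + gym_share * 0.04
# Variation B: trainers barely convert, gym-goers convert at ~46%
variation_b = trainer_share * 0.02 + gym_share * 0.46

lift = variation_b / control - 1
print(f"control {control:.1%}, variation B {variation_b:.1%}, lift {lift:.0%}")
# control 12.0%, variation B 34.6%, lift 189%
```

On paper, Variation B nearly triples your conversion rate; in reality, it trades away almost every trainer for a flood of gym-goers.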

A/B testing: lift in CVR for wrong audience graph
If you focus solely on aggregate CVR, you may be optimizing for the wrong crowd.

The bottom line is that if you rely solely on aggregate CVR, you’re wasting time – your own, and that of your prospects.

Optimizing for customers, not visitors

Optimizing for visitors does not lead to the same outcomes as optimizing for customers.

When you optimize landing page design, ad copy and onboarding for your target segment, you contribute to Ropeburn’s bottom line and you increase the average value of each visitor. This helps grow your business. When you optimize for all visitors, you’re driving blind.

When Craigslist runs usability or A/B tests, it must segment its results.

At a glance, Craigslist’s apartments section caters first to consumers (people seeking keys to an apartment), then to creators (people providing keys to the apartment and creating listings).

A/B test: Craigslist example
Craigslist’s apartments section caters to consumers first, then creators.

If Craigslist ran blind tests with no segmentation, would fee brokers or spambots dictate design? What would that look like?

Solutions: Segmentation and awareness

According to Gleanster Research, “only 25% of leads are legitimate” — so you’ll want to pay close attention to the types of visitors participating in your experiment and their respective conversion rates.

Ultimately, you want to be creating appropriate barriers to entry; better-segmented outbound marketing can improve the odds that a visitor to your site is a member of your target population.

If you’re not sure how to go about segmenting your visitors, here are some suggestions to get you started:

1. Write copy that calls your audience out by name

Ever ask someone what kind of music they like and have them reply with, “everything”? Sucks, right? If you include everyone, you include no one. If you’re speaking to men in their 20s, say so.

LinkedIn’s product page helps visitors identify if they’re in the right place by providing the industry, title and location of a customer.

A/B testing: LinkedIn example
LinkedIn uses industry, title and location keywords to indicate to the visitor whether or not they are in the right place.

Bombfell says what it is and who it’s for in 69 characters.

A/B testing: Bombfell Example
Bombfell’s concise description allows people to self-identify whether the service is right for them.

2. Use form fields effectively

Optimize the number and type of your form fields by revenue.

There is an inverse relationship between the number of form fields and CVR; however, conversion rate isn’t everything. If lead count goes down but lead quality goes up, that might be a good thing. And if you have strong lead nurturing programs in place, you can get away with fewer form fields.

Consider the example below: before you can even draft your message to Tim Ryan, you have to prove your message will be relevant to him. If the internet is angry over a Tim Ryan decision, they’ll first have to look up a valid 5-digit ZIP code (and its 4-digit extension) in his district to contact him.

A/B testing: Tim Ryan example
Tim Ryan’s contact form uses zip code form fields to ensure messages are relevant.

3. Measure landing page performance by revenue, not just conversion rate

Look at the effect of your experiments on pipeline and you will quickly realize whether you are optimizing for the right audience.

For example, imagine a scenario where landing page B has a conversion rate of 50% and drives 100 leads per month, and landing page C has a CVR of 25% and drives 50 leads per month. It’s faster to look only at conversion rate and conclude that B is the winner than it is to look at revenue.

However, when we look at revenue, we may find that C drives $500/month while B drives only $300/month. C brings in $10 per lead, while B brings in $3 per lead and a lot of extra time sifting through bad leads.
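The comparison above works out like this (a sketch using the hypothetical figures from this section):

```python
# Hypothetical landing pages from the example above
pages = {
    "B": {"leads": 100, "revenue": 300},  # 50% CVR, more leads
    "C": {"leads": 50,  "revenue": 500},  # 25% CVR, fewer but better leads
}

# Revenue per lead, not conversion rate, picks the real winner
for name, p in pages.items():
    print(name, p["revenue"] / p["leads"])  # B: 3.0, C: 10.0

winner = max(pages, key=lambda n: pages[n]["revenue"] / pages[n]["leads"])
print(winner)  # C, despite half the conversion rate
```

Judged by CVR alone, B looks like a clear winner; judged by dollars per lead, it isn’t close.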

If you don’t have a great understanding of your visitors, you must rely on revenue. If you use SFDC, Bizible makes this a cinch.

4. Pay attention to your visitor segments and their respective conversion rates

There is no universal right way to segment your visitors. It depends on your company and your goals. Seamless, a fast-growing online food service, may need to segment by channel, location, age and weather (think about it – you’re more likely to order delivery on a rainy/snowy day). Another company may need to segment by content type, device and affiliate name.

Start with your best guess, then add or subtract from there. It’s better to capture too much data at first than too little.

A/B Testing: MixPanel dashboard
MixPanel allows you to segment your users so you can get more actionable insight from your data.

You can use MixPanel to better segment your visitors and set up your desired conversion funnels. For example, you can record visitors’ initial referrer (something that can be quite challenging in Google Analytics), segment by a range of baked-in properties, or write your own script to fire events with custom properties.

If all else fails: Cunningham’s Law

“The best way to get the right answer on the internet is not to ask a question, it’s to post the wrong answer” – Ward Cunningham, developer of the first Wiki.

Seek to know your known unknowns with vigilance and, of course, with the help of the internet.

It keeps you honest, and it goes a long way.

– Vincent Barr


About The Author

Photo of Vincent Barr

Vincent Barr is curious. Marketing at MongoDB. VincentBarr.com

Comments

  1. The quote by Cunningham is great. I always found A/B tests to be difficult and ultimately a waste of time. That’s just me though, I like going by trial and error.

  2. Megan Bush says:

    I completely agree that variations tend to perform completely differently for various customer segments. The challenge is figuring out what your customer segments are and their percentage make-up; separating the segments while testing proves to be the difficult/time-consuming part that many people don’t make the time to do!

    • Vincent Barr says:

      Exactly.

      Some platforms make it easier than others to start segmenting traffic (be it by channel, device, time of day, etc.), but that feature is still far enough under the hood that most people either a) don’t know it exists or b) don’t feel moved enough to use it.

  3. John Smeth says:

    Vincent, it was very useful for me.
