Factors That Can Skew A/B Test Results
Conversion rate tests are a crucial way to understand how well a particular website layout will convert compared to a different layout.
One of the primary tests used in analytics is the A/B test, in which a particular layout, A, is compared to a different layout, B. Usually, the hypothesis is that B will perform better by producing a higher conversion rate. Many visitors are sent to each layout, and each layout's conversion rate reflects the combined effectiveness of every aspect of that layout.
Conversion rate tests are used to trial a particular layout before sending it live, but these tests are only as accurate as the software used to run them. Even with industry-standard software, there can still be large differences between the test results for a particular layout and the real-world results it later produces.
There are several common, easily remedied reasons why a test might not give accurate results. Tests should be run until fully completed - don't peek and act on the numbers before they are done. They should also be run to a 95% confidence level. Even when these factors are taken into consideration, however, there can still be a significant difference between test results and actual results.
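As a rough illustration of what a 95% confidence level means in practice, the sketch below applies a two-proportion z-test to the conversion counts of two layouts. The z-test is one common choice and is an assumption here, not necessarily what any particular testing tool uses. The function names and the example counts are hypothetical.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_* are conversion counts and n_* are visitor counts per layout.
    """
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        # No conversions (or all conversions) in both arms: nothing to compare.
        return 0.0, 1.0
    z = (conv_b / n_b - conv_a / n_a) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def is_significant(conv_a, n_a, conv_b, n_b, confidence=0.95):
    """True when the difference clears the chosen confidence level."""
    _, p_value = two_proportion_z_test(conv_a, n_a, conv_b, n_b)
    return p_value < 1 - confidence


# Hypothetical counts: 100 vs. 150 conversions out of 1,000 visitors each.
print(is_significant(100, 1000, 150, 1000))
# A much smaller gap on the same traffic: 100 vs. 105 conversions.
print(is_significant(100, 1000, 105, 1000))
```

A check like this is only meaningful once, at the planned end of the test. Running it repeatedly while the test is in progress and stopping at the first "significant" reading is the peeking problem described above.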
A major factor contributing to this problem was uncovered last year on the Distilled blog, an online marketing website. They discovered that when different types of traffic have different conversion rates, the standard A/B test is often unable to account for this difference.
In their situation, visitors from email had a much higher conversion rate than their referral or unbranded traffic. The A/B test being used was allocating visitors from all sources randomly to the A and B layouts, but the test was not being run long enough for this random allocation to give each layout a truly representative sample. In other words, layout A would get more email traffic than B, or vice versa.
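The arithmetic behind this effect can be sketched in a few lines. The conversion rates below (10% for email, 2% for everything else) and the traffic shares are made-up numbers chosen only for illustration, and the function names are hypothetical.

```python
import random


def blended_rate(email_share, email_rate, other_rate):
    """Overall conversion rate of a layout given its share of email traffic."""
    return email_share * email_rate + (1 - email_share) * other_rate


def simulate_email_share(n_visitors, email_fraction, seed):
    """Randomly split visitors between A and B and return each arm's email share."""
    rng = random.Random(seed)
    counts = {"A": [0, 0], "B": [0, 0]}  # [email visitors, total visitors]
    for _ in range(n_visitors):
        arm = rng.choice("AB")
        is_email = rng.random() < email_fraction
        counts[arm][0] += is_email
        counts[arm][1] += 1
    return {
        arm: (email / total if total else 0.0)
        for arm, (email, total) in counts.items()
    }


# Two identical layouts that happen to receive different traffic mixes.
rate_a = blended_rate(0.30, email_rate=0.10, other_rate=0.02)
rate_b = blended_rate(0.20, email_rate=0.10, other_rate=0.02)
print(f"A: {rate_a:.3f}  B: {rate_b:.3f}")

# With a short test, the random split alone produces unequal email shares.
print(simulate_email_share(n_visitors=200, email_fraction=0.25, seed=1))
```

Under these assumed numbers, a ten-point difference in email share is enough to move the overall conversion rate even though nothing about the layouts differs.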
The same problem was discovered on the Cork Wallis blog. When using an A/A test to test the same layout twice, different conversion rates were obtained for each version of A. The two layouts were exactly the same and so should have given the same results.
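A small simulation gives a feel for how far apart two identical arms can land by chance alone. This is a sketch under assumed numbers: the 5% true conversion rate, the sample sizes, and the function names are all hypothetical, and visitors are modeled as independent coin flips rather than as a mix of traffic sources.

```python
import random
import statistics


def aa_gap(n_per_arm, true_rate, rng):
    """Absolute difference in observed conversion rate between two identical arms."""
    conv_1 = sum(rng.random() < true_rate for _ in range(n_per_arm))
    conv_2 = sum(rng.random() < true_rate for _ in range(n_per_arm))
    return abs(conv_1 - conv_2) / n_per_arm


def median_aa_gap(n_per_arm, true_rate=0.05, trials=100, seed=0):
    """Median gap across many simulated A/A tests of the same size."""
    rng = random.Random(seed)
    return statistics.median(
        aa_gap(n_per_arm, true_rate, rng) for _ in range(trials)
    )


# Compare a short test with a much longer one.
print("100 visitors per arm:  ", median_aa_gap(100))
print("5,000 visitors per arm:", median_aa_gap(5000))
```

The typical gap between the two copies of A shrinks as each arm collects more visitors, which is the reasoning behind running the test longer.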
There is a good two-fold solution to this problem. The first part is to run A/A/B/B tests instead of the basic A/B. The second part is to run the test for longer than normal.
By running the test longer, each layout is more likely to receive a representative sample of traffic. Using an A/A/B/B test provides a way to check that representative samples were achieved - each version of A should give the same results, and each version of B should give the same results. If the test passes this checkpoint, the conversion rate results can be trusted with much more confidence.
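The A/A/B/B checkpoint described above could be automated along these lines. This is a minimal sketch, assuming a two-proportion z-test as the comparison and a 0.05 threshold. The function names, the result keys, and the example counts are hypothetical.

```python
import math


def p_value(conv_1, n_1, conv_2, n_2):
    """Two-sided p-value for the difference between two conversion rates."""
    pooled = (conv_1 + conv_2) / (n_1 + n_2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_1 + 1 / n_2))
    if se == 0:
        return 1.0
    z = (conv_2 / n_2 - conv_1 / n_1) / se
    return math.erfc(abs(z) / math.sqrt(2))


def evaluate_aabb(a1, a2, b1, b2, alpha=0.05):
    """Each argument is a (conversions, visitors) tuple for one arm."""
    # Checkpoint: the two copies of each layout should not differ detectably.
    duplicates_agree = p_value(*a1, *a2) >= alpha and p_value(*b1, *b2) >= alpha
    # Only compare A against B (on pooled counts) if the checkpoint passes.
    pooled_a = (a1[0] + a2[0], a1[1] + a2[1])
    pooled_b = (b1[0] + b2[0], b1[1] + b2[1])
    return {
        "duplicates_agree": duplicates_agree,
        "b_differs_from_a": duplicates_agree
        and p_value(*pooled_a, *pooled_b) < alpha,
    }


# Hypothetical counts: the duplicates agree and B converts better.
print(evaluate_aabb((50, 1000), (52, 1000), (80, 1000), (78, 1000)))
# Hypothetical counts: the two copies of A disagree, so the test is suspect.
print(evaluate_aabb((30, 1000), (70, 1000), (80, 1000), (78, 1000)))
```

Two caveats apply to this sketch. Duplicates that agree are evidence of a representative sample, not proof of one, and at this threshold roughly one in twenty pairs of truly identical arms will still disagree by chance.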
A/B Testing at Scripted
At Scripted, we've used a number of testing services and have found our A/B tests sometimes useful and sometimes inconclusive. When looking at user behavior and landing page performance, it's easy to get caught up in small changes. Do you catch yourself asking questions such as, "If I make this color orange instead of blue, will it lead to more conversions?" Or, "If I change the copy to 'Get Started' instead of 'Buy Now', will it improve CVR?"
The answer for most companies is, "We're not sure." Even if you A/B test those hypotheses, most websites won't have enough traffic to produce statistically significant results. Although colors and text are fun to change, landing page testing can be simple. At Scripted, we find ourselves testing where we direct our users, not what the copy says. The biggest impact you can have is by changing the number of CTAs on your page, or finding out what's missing on your page. Also, when testing landing pages, only change one major item at a time. Changing too many buttons, CTAs, colors, etc. at once will make it difficult to identify which change was behind the user behavior.
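To see why small changes are hard to test on a low-traffic site, it helps to estimate how many visitors each variant needs before a given lift becomes detectable. The sketch below uses a standard normal-approximation formula for comparing two proportions. The 80% power setting, the function name, and the example rates are assumptions for illustration only.

```python
import math
from statistics import NormalDist


def visitors_per_variant(baseline_rate, expected_rate, confidence=0.95, power=0.80):
    """Approximate visitors needed per variant to detect the given lift."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    mean_rate = (baseline_rate + expected_rate) / 2
    # Spread of the difference if the two variants actually convert the same.
    null_term = z_alpha * math.sqrt(2 * mean_rate * (1 - mean_rate))
    # Spread of the difference if the expected lift is real.
    alt_term = z_beta * math.sqrt(
        baseline_rate * (1 - baseline_rate) + expected_rate * (1 - expected_rate)
    )
    return math.ceil(((null_term + alt_term) / (expected_rate - baseline_rate)) ** 2)


# A subtle change: hoping to move a 5% conversion rate to 6%.
print(visitors_per_variant(0.05, 0.06))
# A dramatic change: hoping to move a 5% conversion rate to 10%.
print(visitors_per_variant(0.05, 0.10))
```

Under these assumptions the subtle change needs thousands of visitors per variant, while the dramatic one needs only a few hundred, which is one way to read the advice to test larger structural changes rather than button colors.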
Great A/B testing resources: