Pairing Up Usability Testing with A/B Testing

July 18, 2011

Traditional usability testing involves several steps: creating screeners, sending out recruitment email messages, scheduling sessions, creating test scripts, conducting the test sessions, consolidating the findings, and making design recommendations. Large-scale usability testing can run for months, which is a big investment in time, money, and effort.

One of the challenges usability professionals constantly face is showing the value of usability testing through quantifiable results. Convincing a client to invest tens of thousands of dollars in usability testing often requires concrete numbers that show what the return on that investment will be. A client might say, “Yes, the usability testing will tell me how horrible the Web site is and how much users dislike filling out the forms on the Web site. So what?” From the client’s perspective, the real question is: How do I know the usability testing will help me meet my business goals of making more sales, lowering costs, and increasing conversions? Just saying that the user-satisfaction rate will go up is not enough.

A report from Forrester titled “Need to Cut Costs? Improve the Web Site Experience” lists a few easy-to-measure indicators that reliably show the benefits of usability testing. These indicators include fewer customer-service calls about products and Web sites and shorter calls regarding complex issues. We can link changes in such indicators to improvements in a Web site’s user interface. However, there are many other factors that can lead to such changes—such as better product promotion, better support documentation, better internal communication, or simply a decrease in sales. As usability professionals, we need a clear way of showing what we’ve learned from usability testing, how our recommended changes have gotten implemented, and what positive outcomes have occurred because of these changes.

Challenges in Implementing Recommendations from Usability Testing

In an ideal situation, after a well-planned usability study, a redesign takes place, and all of a usability professional’s suggested design changes get implemented. However, in real-life situations, there are several challenges that impede such a result.

First, usability testing usually happens when a Web site is already fully functional and, from the client’s perspective, is in relatively “good shape.” The reason a client wants to invest in usability testing is often to discover quick fixes for the current site. However, frequently, the issues usability testing identifies are neither easy nor quick to fix, which leads to a situation where nothing happens to remedy the usability problems we’ve found.

Second, depending on the amount of time and the budget available for usability testing, studies may involve different numbers of users. Often, the number of users participating in a study is a very small proportion of a larger customer base. Therefore, clients often question the accuracy of the results. “Sure, you heard from two or three people telling you about this particular problem during the testing. What if the thousand other people you didn’t get to talk to disagree with you?” Such thoughts might stop clients from taking full advantage of the test results.

Third, usability testing is qualitative in nature. You may get a finding like “Users are confused by this page and are not sure what to do to move forward.” This tells you that there are problems with the current design, but it’s up to the usability professional to interpret the users’ words and come up with design suggestions, which may vary from person to person. Different usability professionals might agree on the usability problem, but have very different suggestions for improving the design.

A/B Testing and Its Challenges

A/B testing lets you compare the effectiveness of two different versions of the same Web page. Usually a controlled portion of a page’s traffic goes to version A and the rest to version B, so different customers interact with different versions of the page. During A/B testing, you can collect data regarding key performance indicators (KPIs) such as conversion rates, enabling you to compare the results of the two versions. The reason big companies such as Amazon and Google are fond of A/B testing is simple: data talks. Conversion rates don’t lie. Either version A works better or version B does—as the numbers easily demonstrate.
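As a minimal sketch of how a controlled share of traffic might be routed to each version, the Python below hashes a visitor ID into a stable bucket, so the same visitor always sees the same version. The function names, the experiment label, and the 50/50 default are illustrative assumptions, not something the article prescribes.

```python
import hashlib


def assign_variant(user_id: str, experiment: str, share_a: float = 0.5) -> str:
    """Deterministically map a visitor to version "A" or "B".

    `experiment` and `share_a` are hypothetical parameters: the first keeps
    buckets independent across tests, the second is version A's traffic share.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    # Turn the first 8 hex digits into a number in [0, 1).
    position = int(digest[:8], 16) / 2**32
    return "A" if position < share_a else "B"


def conversion_rate(conversions: int, visitors: int) -> float:
    """KPI for one version: the fraction of visitors who converted."""
    return conversions / visitors if visitors else 0.0
```

Hashing rather than picking at random on each visit keeps a returning visitor in the same group, which stops one person from contributing to both versions' conversion rates.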

However, achieving really good results with A/B testing involves a whole different set of challenges. First, you need to develop ideas for alternative design directions for various pages. Is the layout going to be different? Should you try a different picture? Use a different font size? Although ideas may be flying about, there is not necessarily any guidance for how to come up with a winning alternative. What if layout A works better with picture B, but layout B works better with picture A? In cases where there is enough traffic, you can conduct multivariate testing to see which combination of elements works best. However, you can confront a real quandary when all of the alternatives you’ve tested work equally well or badly. It may be that none of the alternatives you’ve tested allows you to discover the factor that would have the most impact on the effectiveness of a page design, so the optimal solution remains undiscovered even after the A/B testing.
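To make the layout-and-picture question concrete, here is a small sketch of a two-by-two multivariate comparison. The visitor and conversion counts are invented purely for illustration, and the helper names are assumptions; the point is only to show how an interaction between elements appears in the numbers.

```python
from itertools import product

LAYOUTS = ["layout A", "layout B"]
PICTURES = ["picture A", "picture B"]

# HYPOTHETICAL data, made up for this example: (visitors, conversions).
observed = {
    ("layout A", "picture A"): (1000, 30),
    ("layout A", "picture B"): (1000, 45),
    ("layout B", "picture A"): (1000, 42),
    ("layout B", "picture B"): (1000, 28),
}

# A full multivariate test needs every combination of elements.
missing = [combo for combo in product(LAYOUTS, PICTURES) if combo not in observed]


def conversion_rates(cells):
    """Conversion rate for each (layout, picture) combination."""
    return {combo: conv / visitors for combo, (visitors, conv) in cells.items()}


def picture_effect(rates, layout):
    """Change in conversion rate from switching picture A to picture B."""
    return rates[(layout, "picture B")] - rates[(layout, "picture A")]


rates = conversion_rates(observed)
best_combo = max(rates, key=rates.get)

# If the picture's effect differs by layout, the elements interact and
# cannot be judged one at a time.
interaction = picture_effect(rates, "layout A") - picture_effect(rates, "layout B")
print(best_combo, round(interaction, 3))
```

In this made-up data, picture B helps layout A but hurts layout B, so testing pictures on only one layout would point to the wrong winner for the other.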

Next, the different versions of a page may have been developed without user input, so the designs have no foundation in user wants and needs. Version B often gets generated by playing around with multiple elements on a page. Without solid user input, such tactics result in generic solutions. When you later run a test in another situation or context, expecting to repeat your earlier success, you might get completely opposite results because you’re testing with a different user base, whose members have different motivations or are accustomed to different user interaction patterns.

Finally, A/B testing is quantitative in nature and, thus, lacks qualitative insights. Although A/B testing usually reaches a large audience, all it provides is a comparison of key performance indicators between two different versions. You might get positive results, but you don’t really know why you got those results or why users prefer one version over another version.

Using Usability Testing and A/B Testing Together

The challenges inherent in each of these methods of testing Web pages made us consider pairing up the two methods. They complement each other well: usability testing supplies the qualitative insight that A/B testing lacks, and A/B testing supplies the large-sample numbers that usability testing lacks. Depending on the purpose of a project, you can take two different approaches to pairing up these methods of testing. If A/B testing is driving a project, you can do a small-scale usability study in preparation for the A/B test to enable you to come up with better design alternatives that have their foundation in user input. If usability testing is driving a project, you can demonstrate the value of usability testing by doing one or more A/B tests of redesigns that you’ve based on the findings of your usability study. We have experienced great success in using these two methods together.

How This Works

Follow these steps when using usability testing and A/B testing together:

1. Conduct usability testing to collect user input and qualitative feedback. If A/B testing is driving a project, usability testing can be very informal and highly focused. It should focus on the page that carries the most important conversion. Just talking to a very small number of target users about this page can give you tremendous insights regarding how best to craft your design alternatives.
2. Go through the findings from your usability study and come up with good design alternatives. The key is to prioritize your findings and focus on improving just one page or one area at a time, because changing too many things on a page at once makes it hard to attribute the results to any one change. Based on the usability findings for a particular page, there might be multiple design alternatives for that page, in which case you may need to conduct internal and even external discussions for your team to come to a consensus.
3. Once your team reaches a consensus, sketch, wireframe, and design the alternative versions, and push them live for A/B testing. Decide on the KPIs you will use to compare the versions before the test starts.
4. Closely monitor the KPIs for the two alternative versions and compare their results.
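For the final step, one common way to check whether a difference in conversion rates is more than noise is a two-proportion z-test. The article does not name a statistical method, so treat the sketch below as one reasonable option rather than the authors' procedure; the function name and inputs are assumptions.

```python
from math import erfc, sqrt


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Compare the conversion rates of versions A and B.

    Returns (z, p_value), where p_value is two-sided. Uses the pooled
    standard error, which assumes reasonably large samples in each group.
    """
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        # Nobody (or everybody) converted in both groups: no evidence either way.
        return 0.0, 1.0
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value under the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Hypothetical usage: 100 of 1,000 visitors converted on A, 150 of 1,000 on B.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
```

A small p-value suggests the gap between the versions is unlikely to be chance alone. It is safer to fix the sample size in advance than to stop the test the moment the numbers look favorable.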

The Outcome

If you generate alternative page designs for an A/B test based on usability testing, it becomes much easier to identify what components are increasing conversion rates than in traditional A/B testing, which takes a guess-and-check approach. Also, because of the quick turnaround A/B testing provides, you can readily confirm the value your usability testing has provided by validating its results. Thus, in combination, these two methods of user research are much more effective than either is on its own.

In UX Strategy | Usability Testing | User Research

Alfonso de la Nuez

July 21, 2011 1:31 AM

Hi Shanshan, great article. Are you familiar with unmoderated online usability studies? I totally agree that “One of the challenges usability professionals constantly face is showing the value of usability testing through quantifiable results.” That’s precisely why we created UserZoom. Also, in recent years, tons of tools have been developed to help overcome the same challenge. There are various articles about it in this magazine.

The key point I’d like to add is this: With unmoderated online usability studies you can run both usability and A/B tests at the same time and get the benefits of both. You can run 2 parallel studies with 2 different versions and have a sample of 100 users go through each study. You can set up tasks to test key areas to measure conversion rates. After you collect the results, you can:
On top of this, if budget and time allow, I’d highly recommend running a more qualitative usability test in the lab, either before or after, and comparing the results for each version. The combination of methods is the way to go, in my opinion.

So having said this, I’m not 100% convinced about the necessity of the A/B test.

Best, Alfonso

Shanshan Ma

July 22, 2011 9:56 AM

Thank you for your comments, Alfonso. I’m familiar with a couple of unmoderated online testing tools such as usertesting.com and UserZoom. Great products, by the way. These products definitely offer great benefits that moderated usability testing doesn’t.

Regarding your question, the purpose of adding A/B testing to usability testing is to clearly quantify the value of the testing, moderated or unmoderated. If you are already doing unmoderated, comparative tests in parallel, the purpose of adding a quick-and-dirty usability test beforehand is to help you discover winning solutions, instead of simply playing with ideas. I have found it very useful in my practice.
