Usability Testing Versus Expert Reviews

By Janet M. Six

Published: October 19, 2009

In this Ask UXmatters column—which is the first in a series of three columns focusing on usability—our experts discuss the use of usability testing versus expert reviews. In the upcoming columns, we’ll discuss what usability techniques to use when money or time is tight and how to best conduct remote usability testing.

Look to Ask UXmatters for answers to your questions about user experience matters. If you’d like to see our experts’ responses to your own question in an upcoming edition of Ask UXmatters, please send your question to: ask.uxmatters@uxmatters.com.

Q: Under what circumstances is it more appropriate to do usability testing versus an expert review? What are the benefits and weaknesses of each method that make one or the other more appropriate in different situations?—from a UXmatters reader

The following experts have contributed answers to this question:

  • Todd Follansbee—User Experience Architect and Lead Consultant at Web Marketing Resources
  • Mike Hughes—User Assistance Architect at IBM Internet Security Systems; UXmatters columnist
  • Tobias Komischke—Director of User Experience at Infragistics
  • Stephanie Rosenbaum—CEO of pioneering UX consultancy Tec-Ed, Inc., and a charter member of UPA
  • Jim Ross—Principal of Design Research at Electronic Ink
  • Whitney Quesenbery—Principal Consultant at Whitney Interactive Design; Past-President, Usability Professionals’ Association (UPA); Fellow, Society for Technical Communication (STC); and UXmatters columnist
  • Paul Sherman—Principal at Sherman Group User Experience; Vice President of Usability Professionals’ Association; UXmatters columnist
  • Kyle Soucy—Founding Principal at Usable Interface
  • Daniel Szuc—Principal Usability Consultant at Apogee Usability Asia; Founding Member and President of the UPA China Hong Kong Branch

Determining Your Approach to Usability Evaluation

“Plan for this up front with your client,” suggests Daniel. “The main drivers for deciding when to use any research approach include the following variables—to name just a few:

  • time
  • budget
  • stage in development
  • how quickly the team needs results
  • functions you are evaluating
  • what you and the product team want to find out
  • I’d be interested in hearing what your drivers are…

“The best approach also depends on your own standing on a team—whether they view you as the expert—and how well you can back up your findings from an expert review. For example, are you basing a design change on your own opinion, years of experience in their business domain, years of experience in usability or user experience, customer knowledge, user interface best practices, design patterns, or competitive analyses? Can you map a design change to projections on how it can help the business?

“We should view all user research methods as ways to help us make ongoing product enhancements and feel free to mix and match methods as needed. Unfortunately, too many still see usability testing as a final stamp of approval or the sole opportunity to hear from users or embed usability into the development process. We have to find ways to change this perception in industry.”

“We need to remember that expert review is a user-free method,” replies Stephanie. “Regardless of the evaluators’ skill and experience, they remain surrogate users—expert evaluators who emulate users—and not typical users. The results of expert review are not actual, primary user data and should lead to—not replace—user research.

“Real users always surprise us. They often have problems we don’t expect, and they sometimes breeze through where we expect them to bog down. Also, expert review rarely emulates all the key audience groups, and it doesn’t tell us which problems users will encounter most often.

“Another concern about expert review is political: Development and marketing teams often have strong design opinions. The results of an expert review can sound like just another opinion to them—although our UX perspective can help us be a tie-breaker.

“In contrast, the behavioral data—often including metrics—of usability testing is reassuring to corporate managers, especially in engineering-driven organizations. Usability testing also has a strong psychological benefit for the observers. If developers can watch people having problems using their site or product, this experience is often more convincing than the opinions of UX professionals, however similar.

“In an ongoing usability evaluation program, the best approach could consist of two phases: expert review followed by usability testing. Expert review harvests the low-hanging fruit. We can make improvements immediately, so our test participants don’t spend half their sessions struggling with the same obvious usability problems. Plus, these problems can mask other equally important issues that we would have found if we had already addressed the problems we identified during an expert review.

“Here are some things to consider when choosing between these methods:

  • schedule—If your time is extremely limited and you have experts readily available, expert review can yield faster results than usability testing.
  • budget—Similarly, a modest expert review costs less than a modest usability test.
  • impact and validity—If no other user research will take place before a product’s release, conducting only an expert review puts you in a risky situation. If you don’t test with users before release, you will be testing with customers after release.
  • corporate culture—How much does your organization value expert advice? Will they act on it? If you’re in doubt, make a greater effort to find the time and budget for usability testing.”

Complementary Approaches to Usability Evaluation

“An expert review—sometimes called a usability review—is a nice way for you to get familiar with a product, a business, and its strategy and to identify both positives and opportunities for improvement you could later explore further through other research like usability testing,” explains Daniel. “I find expert reviews are excellent for identifying user interface roadblocks that may be interfering with the overall user experience. However, expert reviews do not involve users, which is a disadvantage at times. They are not necessarily good at getting into a user’s mind or understanding whether what you’re presenting matches a real user need or the product is the right one to start with. This is where usability testing can help.

“Usability testing’s main advantage is hearing the voice of the customer—quotations, frustrations, sighs, needs, and suggestions for improvements. You can better understand whether your product meets a user’s expectations through usability testing. Unfortunately, some people think usability testing is something that is done at the end, on a finished product, when you should embrace it right along with design and development—always testing your assumptions and seeing how you can enhance a product along the way.

“You should use what you have learned during your expert review as input for your usability testing. I often see situations where usability professionals test parts of a product with users that an expert review based on best practice could better inform. This is not a good use of the users’ time. We should value time with users as gold. So it is better to address strategic user interface roadblocks or framework issues during usability testing with users and address granular user interface issues with an expert review. We don’t have to test everything with users.”

“Usability testing and expert reviews are both helpful and tend to find different issues,” responds Jim. “In the ideal situation, they would both be used for the most comprehensive analysis of the usability of a user interface.

“Expert reviews are especially useful for finding violations of usability standards and best practices. These are often obvious issues that may or may not cause problems during usability testing. For many of these types of issues, usability testing is not necessary to find them or to confidently say that they are a problem. For example, inconsistencies between links and page titles, underlined text that is not a link, difficult-to-read text, contrast issues, and accessibility issues are a few of the problems that an expert review can easily find. An expert review can be more thorough and evaluate more parts of a user interface than usability testing can, finding a greater number of problems, because testing is usually limited in time and scope, focusing on certain tasks and parts of an interface.

“Usability testing is better suited to finding the big issues—problems that affect users trying to perform tasks. Even the best usability expert is not part of the target audience for an interface, so he or she cannot predict all of the problems users will face. In addition to the obvious problems that an expert review finds, there are often issues that a usability expert can only assume might cause problems. These types of issues require usability testing to determine whether they are actually problems for users. Plus, there are always problems that a usability expert—not being part of the target user group—cannot find. Usability testing almost always reveals new insights that no one from a project team ever considered.

“Before doing usability testing, it is helpful to do at least an informal expert review to determine what to focus on during testing. You can do an expert review to find the obvious problems, allowing usability testing to find and validate the more important problems.”

Mike advocates usability testing and says, “Usability testing provides more authentic data than does an expert review.” However, he also sees usability testing and expert review as complementary approaches. “You are seeing users interpret a user interface in the context of trying to solve a real problem or conduct a typical task. Expert reviews are more appropriate for an initial assessment of how well a design or design team is following the accepted best practices of user interface design. I recommend doing an expert review early in a project, when the management question is essentially ‘In general, how well does it look like we’re doing?’ Later in a project, I consider usability testing to be the best way to answer the question ‘How effective is a particular user interface in supplying a successful and satisfying user experience for a specific context?’”

Heuristic Evaluations Versus Expert Reviews

“The benefits of heuristic evaluations cannot be denied,” asserts Kyle. “My eyes were really opened to their benefits after participating in one of Rolf Molich’s Comparative Usability Evaluations (CUE) on the Ikea Web site. During this study, the person who actually found the most usability problems was the one who conducted a heuristic evaluation.

“When I first started out as a usability professional, I didn’t put much stock in heuristic evaluations. I thought it was pompous for us to assume we knew better than users what was usable. I guess that means I didn’t put much stock in us as usability professionals. But, now, I believe experience has proven that, even without seeing the different perspectives of users and sometimes having limited domain knowledge, we can evaluate a system and provide actionable recommendations toward better usability. That being said, I still don’t believe a heuristic evaluation is a replacement for usability testing.

“We need to ask ourselves this important question: Should we consider something a usability problem only if it violates a usability principle? Personally, I don’t think so. There are times, when I’m evaluating a user interface, that I run across a problem that doesn’t violate any of the known heuristics. So, we can’t solely rely on heuristics to tell us if something is or is not usable.

“Many have criticized heuristic evaluations and other methods of usability evaluation for not providing reproducible results. For example, two people evaluating the same user interface can, and typically do, have different results. The question is: Does this matter? Some people want a longer and more explicit list of heuristics, so we can try to codify usability evaluation and make it easier for anyone to review an interface and provide the same diagnosis. Essentially, this would try to turn what we do from a craft into a science. Although this is a lofty goal, I don’t think it is possible. I’m all for making usability more scientific, but I don’t believe defining more heuristics is the way to do it.

“What we do is a craft, and our own unique perspectives and experiences are what make our evaluations insightful and valuable. For this reason, it does make a difference who you hire. I think the fact that our work is not reproducible is a good reminder that we always need more than one pair of eyes evaluating our designs. No matter what method you use, you cannot achieve usability when working in a vacuum.

“I prefer to conduct expert reviews rather than heuristic evaluations for three reasons:

  1. A system can comply with all of the heuristics and still be unusable.
  2. I don’t think it’s possible to list all of the heuristics, especially as our products become increasingly more complex.
  3. Heuristics are not client friendly.

“Some UX professionals tend to have a poor habit of speaking their own language rather than the language of business. I constantly see heuristic evaluations that label problems by the heuristics a user interface violates rather than expressing problems in easily understandable terms. For this reason, if you’re going to conduct heuristic evaluations, I’m a big fan of creating your own heuristics that are specific to your product and audience.”

When Do You Really Need Usability Testing?

Paul recommends doing “usability testing in these situations:

  • when you absolutely, positively have to get the workflow right
  • when others in your organization—the designer, the product manager, development, management—need convincing
  • when you don’t have a complete, detailed, and validated user model

“In other words, in most situations. The biggest benefit of usability testing is that it takes design criticism out of the realm of opinion and puts it into the realm of data. When a product team treats a usability test as a group activity—observing and discussing the problems users encounter—it facilitates the attainment of a shared vision among the team members.”

Despite the value Kyle sees in expert reviews, she agrees, “I prefer to conduct testing rather than evaluations for one reason: Not because the results are better, but because, without testing, it is always just our opinion. My clients can argue with me, but they can’t argue with their users and customers.”

Testing to Assess Users’ Emotional Response

“If you have not yet had a usability expert directly involved in a project, start with an expert review,” suggests Todd. “If a site does not meet usability guidelines—and it is highly unlikely that it would meet guidelines without a UX expert in the mix—direct usability testing would likely just reveal barriers, and you would not have the chance to explore the deeper persuasion issues that testing can show. An expert review would reveal these same barriers for far less cost.

“At their simplest, sites evolve from being functional (the links, shopping cart, or forms all work, and a site functions in most browsers) to usable (a site meets a high percentage of usability guidelines) to persuasive (a site’s compelling content motivates sales, consensus, referrals, and more). Since buying is primarily an emotional decision, based loosely upon facts, to move a site from being very usable to being persuasive, you need to do usability testing to understand users’ emotional response and judge a site’s persuasiveness. I am quick to add that I don’t know if this belief is widely accepted among usability professionals, and I am not sure it is part of a typical curriculum.

“A usability test can demonstrate, through the eyes of a range of participants, users’ emotional response to the brand, statement of business purpose, graphics, long- and short-term messaging, competitive position, sales path, and more. Good, direct testing focuses on finding solutions, in addition to revealing problems, by soliciting from test participants how they might resolve an issue that arises. Most expert reviews offer only one viewpoint. Another advantage of direct usability testing is that stakeholders seem to respond more strongly to videos of test participants’ frustrations with a friction-laden, unnecessarily long form than to a screen shot and description of issues with the same form in an expert review. If it is on the small screen, right or wrong, we tend to accept it as fact.

“In summary, if you have had expert UX input in a site’s development, you would have had an expert review and are ready for direct usability testing. If you have not had expert UX advice, your site is likely to fail—like every site we have tested under those circumstances—to meet basic usability guidelines. Do a less expensive expert review to identify and fix basic problems, then go for the final, persuasive elements with a usability test. I know of no other way to measure the effectiveness of the application of persuasion psychology principles other than through direct usability testing.”

There Is No User-Centered Design Without Users

“If you must do an expert review, you should make sure that your approach includes the perspectives of users,” recommends Whitney. “Otherwise, it may be a good design checklist, but may not get at the real usability issues in the work. For example, look at the research-based guidelines on usability.gov. They give each guideline two ratings: one for the strength of the research evidence supporting the guideline and another for how important it is in enhancing usability—or preventing usability problems. One of the interesting and subtle points about these ratings is that there are some guidelines for which there is a lot of evidence, but that aren’t rated as very important.

“This is not to say that expert opinion is not important. Of course, it is. Especially if that expertise is based on much actual experience—as in Malcolm Gladwell’s Blink and Outliers. But, if you want to understand how someone who is not an insider to the design process would react to a product, you need a way to get the user perspective into your thinking. That’s the real value of usability testing as part of a design process. Even a few test participants will do.

“As Steve Krug says, ‘It’s not rocket surgery.’ What you are after is not just opinions, but insights into how different people would react to and interact with a product. All you need is 5 or 6 people, for 1 hour each, in 1 day of usability testing. Have your team observe the tests, discuss what you saw, and decide how to use your insights to improve the design. If you aren’t sure what something means, repeat your test.

“The biggest myths are that usability testing is hard, users are difficult to reach, you need many participants for a formative—or diagnostic—test, and you need a formal report.

“Now, if you really, really can’t do anything but an expert review, at least let your personas guide it. Use your personas—or other user research—as a lens through which to examine the design. What are users’ top reasons for using an application? What would they do first? Ginny Redish, Dana Chisnell, and Amy Lee did a large project at AARP using this technique—based on a lot of user research. Caroline Jarrett and I have done a presentation on this approach: ‘Conducting a User-Centered Expert Review.’

“What’s really amazing is how well it works. I’ve used it with a room full of local government Web managers with no usability experience. Give them just one persona as a way to look at a site and their view changes—in five minutes. Once they see there’s more than one way to experience a site, a lot of interesting conversations break out. And those conversations are the whole point.”

“I recommend always doing both,” encourages Tobias. “Expert reviews that are performed by specialists, using standards and heuristics, reveal easy-to-catch usability problems in a very cost-efficient way. By easy to catch, I mean things you can spot by checking against guidelines—for example, Does the user interface speak the typical language of the user? Experts cannot be as familiar with a specific context of use as users, so they investigate more on the surface. By cost efficiency, I mean you can do an expert review quickly and don’t have to recruit actual or future users or incentivize them to participate in a test. Also, the analysis and compilation of results is typically not as extensive as for usability tests.

“Usability testing—where you observe users while they try to solve authentic tasks using a system or concept that is under review—gets into much more depth and can reveal more serious problems. Users have mental models, experiences, and expectations that are true and, thus, different from those of the UX professionals who carry out expert reviews. This authenticity makes usability tests more valid and valuable.

“While usability testing is more powerful than expert review, both methods in combination are great, because you first want to discover the low-hanging fruit and get them out of the way. For that, it’s more cost effective to use expert reviews. Then, you can save your precious interaction with real users for discovering the harder-to-find issues. So my answer is this: First, do expert reviews, then do usability tests. This should be possible on every project. There’s no good excuse for not doing both. If someone actually holds a gun to your head and forces you to decide on one, choose usability testing. There’s no user-centered design without users!”

Resources

Chisnell, Dana. “Testing in the Wild, Seizing Opportunity.” User Interface Engineering, August 12, 2009. Retrieved October 13, 2009.

Gaffney, Gerry, and Daniel Szuc. The Usability Kit. Melbourne: SitePoint, 2006. Retrieved October 13, 2009.

Molich, Rolf. “CUE - Comparative Usability Evaluation by Dialog Design.” DialogDesign ved Rolf Molich. Retrieved October 13, 2009.

Quesenbery, Whitney. “Choosing the Right Usability Technique: Getting the Answers You Need.” User Friendly 2008. Retrieved October 13, 2009.

Quesenbery, Whitney, and Caroline Jarrett. “Conducting a User-Centered Expert Review.” STC Proceedings, June 2, 2007. Retrieved October 13, 2009.

— —. “Letting Participants Choose Their Own Tasks.” STC Summit, 2009. Retrieved October 13, 2009.

Rosenbaum, Stephanie. “Not Just a Hammer: When and How to Employ Multiple Methods in Usability Programs.” UPA 2000 Proceedings, 2000. Retrieved October 13, 2009.

Rubin, Jeffrey, and Dana Chisnell. Handbook of Usability Testing. Indianapolis, IN: Wiley, 2008.

Szuc, Daniel. “Finding Gold in Your User Research Results.” UXmatters, July 6, 2009. Retrieved October 13, 2009.

usability.gov. “Usability Testing Guidelines.” U.S. Department of Health and Human Services. Retrieved October 13, 2009.

6 Comments

Great article. Curious what readers and the panelists themselves think about some of the newer, low-cost, online usability testing tools. Website Magazine reviewed a few of them here: “Turn Right On Usability Lane.” (Full disclosure: I work with one of them, UserTesting.com.) Honest, open feedback welcome.

Hi Brett,

It’s interesting that you should ask this question. I just wrote an article for UXmatters on remote, unmoderated usability testing that should be published in the near future. To answer your question now, I’d have to say that unmoderated usability testing can yield an enormous amount of feedback and data from your users, but it should not be used as a replacement for moderated usability testing. I believe unmoderated testing is best used in conjunction with moderated testing. You can use the large samples to help put numbers behind the key findings from your initial moderated research.

There are lots of pros and cons to using different unmoderated usability testing tools, and you need to evaluate them before deciding to conduct an unmoderated usability test.

Glad you liked it, Brett. I think Janet does a great job.

First, almost any testing is likely to be better than no testing, and from what I have seen, yours is much better than no testing. As a straightforward task completion testing tool it seems to help.

We just started a new project for a company that had done some testing. (I believe it was with your product, but maybe not.) They felt that they had improved their site as a result. There were no usability experts involved with the testing.

So, I like it, because it has started them on the path to more testing! I doubt they realize the limitations of a tool like this and that may be a risk. They didn’t get much, if any, user feedback or solutions from it. One thing I like about direct testing is the range of great ideas and solutions that come up during testing.

Remote automated testing cannot reveal any of the emotional responses, brand impact, long-term messaging, or anxiety that live, expert testing can reveal. I am happy to go into this in more detail, because I am quite sure that the testing we do is atypical.

If the uneducated business owner thinks this represents everything that direct usability testing has to offer, his business could suffer as a result.

PS—The site I was speaking about improved their task completion, but little more than that.

Hi Brett,

I’ve been using Feng-GUI for almost 2 years now, and I like it a lot. Before it came out, we had to mess around with MATLAB to create these saliency heat maps. Saliency algorithms obviously cannot replace eyetracking, but they’re good enough to quickly test how different elements in a UI draw visual attention. Using Feng-GUI, I find it very easy to explain to clients what the effects of pop-out features are and how to use them to shape design.

Our team is based on an internal consulting model. As such, we have to consider not just the user perspective, but the satisfaction of our customers: internal project teams of IT, business, and/or marketing professionals. We are a very small UX Services team, in a large company with large-scale enterprise projects. The types of projects run the gamut from one-page form design, to complex UI integration efforts between Web pages and enterprise software such as SAP and Pega. And, of course, budget and timelines rule.

In an environment like this, we have to do expert reviews sometimes. Our twist is to provide a team expert review with a very fast turn around, delivering recommendations in a meeting instead of a report. There are two goals: discover some really obvious usability bloopers and deliver a good experience for project managers and developers so they’ll come back—next time, for a usability test. The team expert review is the hook that hopefully brings in future work. We’re working out the kinks on how best to run these team reviews, like how to facilitate the discussion and what to focus on—heuristics? task flows? both? Even so, project teams have come away with some good recommendations and have expressed their thanks to us.

If given the choice, I would prefer to do a 3–5 person usability test, as per Steve Krug, over an expert review or heuristic evaluation. Grabbing folks in the cafeteria to walk through screen shots is okay for a tight, iterative process. On the other hand, I get raised eyebrows from internal marketing clients about our tiny sample sizes. Our engagements with marketing are different from those with IT. We’re not there yet, but once we introduce assessments of persuasive design and emotion, we’ll need to get much bigger sample sizes. From my readings and talks with colleagues, this is where remote unmoderated testing comes in. This is new territory for me for sure—statistics, yikes!—but people are doing this already. (Bill Albert, Director of Bentley’s Design and Usability Center, just delivered a paper to the New Hampshire UPA chapter about large-scale usability testing.)

In a consulting model, it comes down to knowing our users and our customers. What lingo do they speak? What data do they need? What do they want to measure and what methods are best for the context? Hey, where have I heard this before???

Todd and Kyle both said something very important: You have to be clear on both what questions you are trying to answer and what questions the approach you are considering can answer. If you ask the wrong question, your answers won’t be much use in improving your product. If you choose the wrong technique, you can get the wrong answers.

I also agree with them that there is no substitute for at least some time spent in the direct company of people using your product. This does not have to be a huge, expensive, time-consuming project, but it is critical that you find some way to meet users directly. I find that some of the new tools are most valuable when I already have a good understanding of users and their context, but need a quick answer to a relatively simple question.

They are also useful as part of a larger project, adding data. For example, on a recent project, we used moderated usability testing, but added some 5-second tests to try to understand what information popped out of the visual design most quickly. Or, we have triangulated between server logs—for a quantitative view of user behavior—and talk aloud—for a deeper, qualitative view of how users interact with a page.

It’s great to have more options, but it means you have to think more carefully about the choices you make. There is simply no one-size-fits-all user research or usability testing.
