Usability testing can seem deceptively easy. You ask people to perform tasks using a user interface, observe what they do, and ask them questions. Sounds simple, right? In comparison to field studies and other, more advanced user-research methods, usability testing might seem like the simplest technique to learn and perform. Perhaps it’s the repetition of observing multiple participants, performing the same tasks and answering the same questions, that makes usability testing begin to seem routine—like something you could do in your sleep.
However, although usability testing may seem simple and routine, anyone who has conducted a lot of testing can testify about the many problems that can occur. In this column, I’ll discuss some of the biggest mistakes you can make in doing usability testing and how to prevent them. But, first, I’d like to make a distinction between mistakes and problems. Mistakes are preventable, while problems are often beyond your control; for example, experiencing technical difficulties in the middle of a test session. Everyone makes mistakes—even experienced usability professionals. But reviewing these common mistakes will help you to avoid them.
Not Spending Enough Time Planning a Study
Planning is the most important part of the usability-testing process. Yet, people often underestimate how much work it involves, so they don’t devote enough time to it. Planning includes defining the goals for a study, the types of people with whom to test, the testing methods to use, the tasks to include, and the questions to ask. These decisions go into the study’s discussion guide, which is the script for tasks and questions the facilitator will follow with the participants. Careful planning is necessary to ensure that the tasks and questions are understandable, occur in a logical order, and help to avoid bias.
To ensure that you spend enough time on planning:
Make sure your project plan includes a dedicated task for usability-test planning, with time set aside specifically for it. Ideally, planning should take at least two days.
If the person who is planning the testing has additional duties such as creating a prototype, ensure that those duties don’t overlap with the usability-test planning task.
Provide time for the client and project team to review and approve the discussion guide, including time to revise the guide if needed.
Not Involving Clients and Project-Team Members in Planning the Testing
It may be tempting to avoid micromanagement by not including clients and project-team members in planning your study, but that’s a big mistake. Involving stakeholders in planning and recruiting ensures that testing focuses on answering their questions. This makes them feel more involved, interested, and invested in the results—and they can’t blame you for the results of things that everyone reviewed and approved. Plus, they are more likely to act on your recommendations if they feel they were part of the testing.
To involve your clients and project-team members in usability testing:
Meet with everyone to determine the goals of testing and the types of participants to include.
Review your discussion guide with the team to get their feedback and approval.
Have them review and approve the recruiting screener, as well as the list of participants to recruit, to ensure that everyone agrees that you’re recruiting the right people for your study.
Encourage everyone to observe the testing.
Testing with the Wrong People
Testing with unrepresentative participants wastes your time and invalidates your results. So it’s extremely important to find and recruit representative users. Although this makes obvious sense, there are many ways to screw this up, including the following:
failing to identify the correct user groups to include in testing
failing to screen participants properly
recruiting professional focus-group participants, who are only in it for the money
testing with friends, family members, and coworkers, simply because they’re easily available, even though they may not fit the profile of the users
relying on a client to recruit participants, but not providing them with enough information about who to recruit and who to avoid
To ensure that you recruit and test with representative users:
Determine which user groups to include in the testing.
Decide what important characteristics define those user groups. Use those characteristics to screen potential participants and ensure they are representative of your users.
Don’t get caught up in recruiting by demographics, unless there are important demographics that define your user groups. For example, income probably isn’t that important when testing a retail pharmacy’s Web site, but it would be very important in testing an expensive, luxury cruise liner’s Web site.
Use lists of existing users—whether customers, members, or employees—whenever you can. These people are already actual users and require little, if any, screening.
If your clients are doing the recruiting, provide them with details about the types of people to recruit and those to avoid.
Avoid using recruiting companies that rely heavily on people who are already in their database.
Not Being Prepared for No-shows and Cancellations
You can’t avoid getting no-shows, last-minute cancellations, or the occasional participant who slips through screening without really fitting the user profile. But, if you aren’t prepared for these situations, you can end up not having enough participants.
To minimize the impact of no-shows and cancellations:
Locate and schedule the test sessions at a place and time that’s convenient for participants. For example, if your participants live in the suburbs and your test requires a long drive into the city, through rush-hour traffic, you’ll probably get more no-shows.
Provide generous incentives to encourage people to keep their scheduled commitments.
Recruit a few more participants than you need, so even if you get some no-shows, you’ll still have enough participants.
If you’ll have many observers or you’ll be renting an expensive testing facility, recruit extra, standby participants who can sit in a waiting room until you need them and fill in for no-shows. Although you’ll have to pay them a higher incentive to sit around waiting, that’s less expensive than wasting time in an expensive lab or wasting the time of many observers.
Set the expectation with your client and observers that no-shows, last-minute cancellations, and the occasional disqualified participant are unavoidable aspects of usability testing. They are less likely to get upset if you’ve set this expectation up front.
Trying to Test Too Much
I always find it difficult to restrain myself from trying to test too much. For each project, there are many possible tasks and questions you’d like to include in your usability study. Yet, you can fit only so much into the typical, hour-long session. If you try to include too much, you’ll either run out of time or find yourself rushing through test sessions, skipping tasks, and not having enough time to ask probing, follow-up questions.
To avoid cramming too much into test sessions:
First, list out everything you want to learn from your testing, without censoring yourself or considering the time it will take.
Prioritize your lists of tasks and questions, focusing on the most important items first and eliminating low-priority items.
If you need to include lower-priority tasks, put them in an optional section at the end of your discussion guide. Then, conduct those tasks only if there’s time left toward the end of a session.
If you can’t move lower-priority tasks and questions to the end of the discussion guide, but really must include them, mark them as low priority or optional so you can skip them if you’re running short on time.
If you still have a lot to test, consider making the sessions longer—up to 90 minutes or two hours, if necessary. Or break up your study into several rounds, each focusing on different functions or tasks.
Err on the side of scheduling too much time per session. It’s always better to have extra time at the end and not need it than to run out of time. Extending a 60-minute session to 90 minutes can make all the difference.
Do a pilot test to determine how long a test session will take, but remember, some people take longer than others, either because they encounter more problems or are more talkative.
Testing Too Many Versions
Testing multiple versions of a design solution is a great way to compare different design directions, but if you try to test too many versions, test sessions become overly complicated. Participants can become overwhelmed and have difficulty remembering the differences between the versions they’ve experienced. Plus, the facilitator has the added challenge of keeping track of all the versions and rotating the order in which participants experience them to avoid order effects.
To avoid testing too many versions of a design solution:
During the design process, stress the importance of actually making design decisions rather than leaving every open question or dispute to be settled through testing.
Limit testing to two or three different versions at most.
Ensure that the various versions of a design are clearly different. If the differences are so minor that participants have difficulty noticing them, that’s an indication there were design decisions the team should already have made on their own.
Make the tradeoffs of testing multiple versions clear to your client and project team. The more versions you test, the fewer tasks and questions you’ll be able to include for each one.
Waiting Too Long and Conducting Only One Usability Study
A common mistake of organizations with less UX maturity is to conduct only one usability study, near the end of the design process—or worse, at the end of development. Usability testing is most effective when it’s part of an iterative design process that involves evaluating designs through multiple usability tests, starting early in the design phase. By the end of the design process, designers will have spent a lot of time on the design, and its direction will already be set. Finding major problems at this point and making changes to resolve them requires a lot more work. A far bigger gamble is waiting until the end of development to conduct usability testing.
Regardless of a project’s stage in the development lifecycle, conducting only one usability study is always a mistake. Yes, one usability study is better than nothing. It lets you find problems that you can correct. But, when doing only one round of testing, you can’t evaluate whether your design changes have solved the problems or created additional problems.
To get the most out of usability testing:
Include at least two rounds of testing and design iterations in your project plan. If time or cost is an issue, split one larger study into two smaller studies.
Conduct at least one round of usability testing early in the design process, with a low-fidelity prototype, then conduct the second round with a slightly higher-fidelity prototype.
Ideally, include multiple rounds of usability testing, starting early in the design process.
Not Conducting a Pilot Test
It’s tempting to skip a pilot test, especially when you’re short on time. A pilot test is basically a rehearsal, with someone—often a coworker—playing the part of the participant. The purpose of a pilot test is to discover potential problems in your planning that you can fix before the real testing begins. It can help you assess the following:
Is the test too long?
Are the tasks understandable?
Does the order of the tasks make sense?
Are the tasks or questions biased?
Are the questions repetitive?
Does the prototype work correctly for these tasks?
To evaluate your test plan before the first participant arrives:
Include time in your project plan for conducting a pilot test and then revising the discussion guide and prototype, as necessary. Designating a specific task for pilot testing in the project plan will make it harder to skip this important step.
If possible, conduct the pilot test with a real participant rather than a coworker. This lets you get a more realistic sense of how the test plan is working.
Interrupting Test Sessions
Unless there’s a real emergency, never interrupt a usability-test session. Almost always, the comfort of the participant, the concentration of the facilitator, and the flow of the tasks are far more important than whatever interruption you might consider necessary. Usability testing can seem artificial, uncomfortable, and strange enough for participants as it is. The facilitator spends a lot of time trying to put participants at ease and overcoming the unnaturalness of the situation, so he or she can observe participants performing tasks as naturally as possible and, thus, get their honest feedback. The last thing you should do is throw things off with a disruption.
Why would someone want to interrupt a smoothly running usability test? Here are a few examples of situations I’ve encountered:
An observer wants to ask a question.
The audio in the lab or on the conference call isn’t working.
The conference call or Web conference stopped working.
The observers can’t see the live eyetracking display.
While some of these are important problems, none of them is worth disrupting a session.
To avoid unwanted disruptions:
Make it clear to observers that the test session itself is so important that they should not interrupt it unless there is an actual emergency.
Assign someone other than the facilitator to be the contact person for technical issues and other problems. Tell observers to contact this person if there are issues. Ask this person to enforce the non-interruption policy.
Provide a time and method for getting questions from the observers at the end of each session. For example, you could excuse yourself and go to the observation room to gather any questions they would like to ask, then return to the participant and ask those questions.
Forgetting That the Facilitator Is a Human Being
Usability testing is very difficult mental work. In addition to the interpersonal skills that making a participant feel comfortable and talkative requires, the facilitator must have good observation, listening, and note-taking skills. Facilitators must constantly assess their degree of understanding, decide whether and when to ask probing questions, and determine how to phrase questions to avoid bias. Perhaps most importantly, the facilitator has to have the patience to sit through many test sessions, asking different participants the same questions, seeing the same screens again and again, and hearing similar responses, without getting bored, jaded, or going insane.
This is enough for any facilitator to deal with, but other stakeholders sometimes add to their burden by scheduling too many sessions in one day, not scheduling any breaks between sessions, forgetting to provide food for the facilitator, or bothering the facilitator with additional requests. Adding to the facilitator’s already heavy load only degrades the quality of the testing.
Ease the burden on the facilitator by doing the following:
Treat the facilitator like a rock star. That may sound a little over the top, but that’s actually a good way to think about the facilitator on testing days. The quality of the usability testing depends on how well the facilitator can focus on the sessions. So do whatever you can to keep the facilitator relaxed, well fed, and hydrated. Making it your philosophy to consider the facilitator the most valuable person on testing days helps prevent people from adding to his or her burden.
Designate one or more people to help the observers, answer their questions, order food and beverages, adjust the temperature in the observation room, fix technical issues, coordinate participants, and deal with any other issues. Make sure no one asks the facilitator to deal with any of these things.
Try not to schedule more than four to six one-hour sessions per day. Four is ideal, but if you’re renting an expensive lab or focus-group facility or the facilitator is traveling to a particular location prior to the testing, doing six sessions is acceptable. However, if the sessions are longer than an hour, you should schedule fewer sessions per day.
Include at least a 30-minute break between sessions.
While it can be helpful to discuss the test results with the facilitator between sessions, you must also give him or her some downtime, to get away from the observation room to rest. After being on high mental alert for an hour during a usability-test session, going immediately into a debrief with the observers is not a break.
Provide meals, snacks, and beverages for the facilitator, as well as time to eat and drink.
Being Dismissive of Participants Who Don’t Match Your Expectations
Some observers react defensively when a participant says or does something they disagree with and dismiss that participant as unrepresentative of actual users. Because they want to avoid facing an unpleasant truth, they may discount everything the participant has said or done: “He doesn’t represent our customer.” “She wouldn’t really use our product anyway.”
To prevent observers from being dismissive of participants:
Encourage observers to keep an open mind when participants say or do something unexpected and to resist the urge to immediately dismiss what they say.
Remind them to wait for the results from all of the sessions before coming to any conclusions. One outlier participant won’t affect the results, but if similar results from other participants back up what that person says, it may be valid data.
Prematurely Jumping to Conclusions
When I conduct usability testing, I notice patterns, but I try to keep an open mind and avoid forming conclusions between sessions. Yet, I often find that observers begin to jump to conclusions after seeing only one or two participants. Some even begin to discuss design changes at that point.
To help observers avoid jumping to conclusions:
Before the test sessions, emphasize the importance of keeping an open mind and avoiding premature conclusions. Tell the observers that it’s okay to note patterns between sessions, but to realize that those patterns may change or look different once they’ve seen all of the participants.
Encourage observers to avoid making design decisions until you’ve analyzed all of the data.
In the deliverable that documents your findings, include the numbers of participants who made errors, took actions, and made comments. These numbers help you to counter observers who have fixated on what one particular participant did or said.
Not Providing Enough Time for Analyzing the Results
I’m always amazed when a project team spends a lot of time conducting usability testing, then doesn’t allow enough time for analyzing the results. It’s understandable that everyone is anxious to learn what the results are, make the necessary changes, then move on to the next step in the project. But analyzing the results and making recommendations really does take some time. After spending so much time and money on testing, it’s wasteful to rush through the analysis.
To avoid rushing through analysis:
Ensure that the project plan provides a reasonable amount of time for analyzing the findings—based on the number of participants, the number of tasks and questions, and the complexity of the user interface.
If the team wants results quickly, provide a high-level review of your initial findings, then spend additional time on detailed analysis.
Not Creating a Deliverable to Document Your Findings
Another way in which people try to save time is by not creating a deliverable documenting findings. “Reports and presentations are boring,” they claim. “Let’s just have a meeting to talk about what you’ve found.” The problem is that usability-study findings and recommendations are sometimes complicated and difficult to describe without visuals. Unless you create a deliverable, those who aren’t in the meeting in which you report your findings will never know the results. Without a permanent record of your findings, that information will get lost, and you’ll run the risk of recreating the same problems later on.
To avoid situations in which there would be no deliverable documenting your findings:
Create at least an informal deliverable. It doesn’t have to be a fancy report or presentation. It can be a series of screenshots and bullet points.
If the team wants a quick turnaround, create a high-level, informal deliverable first to get the main points across. Then take the time to put together a more comprehensive deliverable.
A Lot Can Go Wrong!
Although usability testing can sometimes be the most highly planned and structured type of user research, plenty can still go wrong. I’m sure that many of you reading this column have your own horror stories. And it’s likely that there are other usability-testing mistakes I haven’t mentioned here. There are some problems you can neither predict nor prevent, but you can avoid these common mistakes by being aware of them and learning from them.
Principal User Experience Architect at Infragistics
Cranbury, New Jersey, USA
Jim has spent most of the 21st Century researching and designing intuitive and satisfying user experiences. As a UX consultant, he has worked on Web sites, mobile apps, intranets, Web applications, software, and business applications for financial, pharmaceutical, medical, entertainment, retail, technology, and government clients. He has a Master of Science degree in Human-Computer Interaction from DePaul University.