Writing Usability Requirements and Metrics
Published: February 9, 2009
In this installment of Ask UXmatters, our experts discuss how to write effective usability requirements and metrics for the redesign of a legacy public sector system.
Ask UXmatters exists to answer your questions about user experience matters. If you want to read our experts’ responses to your questions in an upcoming installment of Ask UXmatters, please send your questions to: email@example.com.
Q: My task at hand is to write nonfunctional requirements for the usability quality of a large, new Danish system for handling social services in the public sector. There is an existing product, but it is 25 years old and runs on a CICS mainframe. The classic answer to my question is to decide upon usability factors such as learnability, understandability, or efficiency, then establish metrics like the following:
- Six out of ten novice users shall perform task X in Y minutes.
- At most, one in five novices shall encounter critical problems during tasks Q and R.
But still, where do I start? And how do I choose the numbers in these metrics—for example, six out of ten and one in five? Some literature suggests requirements should be written more as general concepts or ideas for interactions, but I need these usability requirements to be strong, because we’re outsourcing the design. Therefore, we must use these requirements to assess and secure the ongoing quality of what the vendor delivers.
So far, it seems to be very difficult for my organization, with no direct prior experience, to decide what metrics or requirements to use. I will suggest something, but I must be prepared to discuss, fight for, and change what I’ve proposed. So, what experience can I draw upon? Can you offer any advice for this process? Where do I start?—Ole Gregersen, Information Architect
The following experts have contributed answers to this question:
- Mike Hughes—User Assistance Architect at IBM Internet Security Systems and UXmatters columnist
- Whitney Quesenbery—Principal Consultant at Whitney Interactive Design; Past-President, Usability Professionals’ Association (UPA); Fellow, Society for Technical Communication (STC); and UXmatters columnist
- Mary Theofanos—Chair of the Industry Usability Reporting Project Working Group at the National Institute of Standards and Technology (NIST)
Whitney responds, “First, I want to applaud you and your organization for trying to create a way to ensure that the system will meet specific user experience goals. This is especially important when you are outsourcing the design work and can have little control over the process except by setting requirements for the outcome.”
Mary shared some work they’re doing at NIST: “The Visualization and Usability Group (VUG) at NIST has also recognized the need for defining usability requirements in enough detail to influence design and allow us to validate them. Working with partners in industry and academia, we have been working to develop guidelines for specifying usability requirements. The working group developed the Common Industry Specification for Usability Requirements (CISUR).”
The Difficulty of Writing Usability Requirements
“Your classic examples are just examples, so I won’t pick on them too much,” says Whitney. “But you are right that they are not very helpful. For one thing, they raise more questions than they answer—for example, What is a novice user? What is a critical problem?”
Mike offers this opinion: “I think we overemphasize metrics when it comes to usability. They often introduce problems in terms of the validity and reliability of the data. First, ask whether a metric is a valid measure of usability. For example, the amount of time it takes to perform a task is often a usability goal—as in your example ‘Six out of ten novice users shall perform task X in Y minutes.’ What if all of the participants took more than Y minutes to complete the task, but everyone said, ‘This is the best application for doing this task I have ever experienced. I’m recommending this product to all my friends.’ Does the product fail to meet the requirements regarding that task? What if all participants perform the task in less than Y minutes, but everyone says, ‘That was a horrible experience. I would never buy this product.’ Does the product satisfy requirements?
“The problem with these metrics is exactly the one you have articulated: What is a good number? Such measures are invalid unless
- you can derive a metric from prior data—For example, you might have Web metrics that show, if users cannot complete a purchase in Y minutes, the abandonment rate goes up.
- a metric is intrinsic to the product—For example, if a medical team cannot complete a triage procedure in less than Y minutes, the patient dies.”
Where to Start When Defining Usability Requirements
Getting more specific about how to approach defining usability requirements, Whitney says, “You need to start by describing what aspects of the user experience are critical to the success of the new program. Let me give you some examples of what I mean. Is absolute time-on-task efficiency important? For systems that support repetitive, short tasks, it might be. In that case, you might want to include a metric for the time it takes to complete a typical core task.
“But it might be that accuracy—effectiveness—is more important. By this, I mean that someone using the system is able to complete tasks correctly and completely—both simple tasks and those that are more complicated. This is more than learnability. It speaks to whether the system supports users in both understanding and completing goals—which might involve several software tasks.”
At NIST, according to Mary, “The CISUR identifies three types of information you need to determine:
- the context of use—the intended users, their goals and tasks, associated equipment, and the physical and social environment in which people use the product
- performance and satisfaction criteria—measures of usability for the product
- test method and the context of testing—the method of testing whether the product meets usability requirements and the context in which the team measures usability
“This approach defines three levels of specifications, each providing more detail about the context of use, performance and satisfaction criteria, and the test method, allowing organizations to evolve their requirements during the design and development process.”
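One way to keep these three kinds of information together is to record each requirement as a small structured entry. The sketch below is only an illustration: the class name, field names, and the example wording are assumptions made here, not part of the CISUR itself.

```python
from dataclasses import dataclass


@dataclass
class UsabilityRequirement:
    """Hypothetical record with one field per kind of information listed above."""
    context_of_use: str   # who the users are, their goals, tasks, and environment
    criteria: str         # the performance or satisfaction measure and its target
    test_method: str      # how, and in what setting, the criterion is checked

    def is_complete(self):
        # A requirement is only testable when all three parts are filled in.
        parts = (self.context_of_use, self.criteria, self.test_method)
        return all(part.strip() for part in parts)


# Illustrative content only; the task and the wording are invented.
requirement = UsabilityRequirement(
    context_of_use="Caseworkers registering a benefit application while on the phone",
    criteria="Completion rate and time on task no worse than on the current system",
    test_method="Moderated test with caseworkers using a realistic day's set of cases",
)
print(requirement.is_complete())
```

A requirement that leaves any of the three parts blank is the kind that invites questions such as "What is a novice user?" later in the contract.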
Whitney asks, “Is this a system that is typically used by people who don’t know the social services system—that is, the public—or is used only infrequently or is for one-time use? Or is it used by employees who know how to do their jobs and just need the system to support their work? Your definitions of the users should be contextual, not general descriptions like novice. And remember, even people who use a system all the time are novices whenever they have to handle an unusual or difficult task. I’d bet public social services are full of exception cases that don’t fit into nice neat buckets.
“So, to summarize this point, you need to make sure your requirements are clear about the context of use—who the users are, what their goals are, and the conditions under which they use the system to meet those goals.”
“Your question about where you get the measurement values is a much harder point to address,” Whitney continues. “The most important thing to tell you is that you don’t just make them up. You might want to set these values in comparison to metrics for the current system. I assume you want the new system to be at least as good as the old one—and hopefully, better. You could run a benchmarking usability test, in which you collect measurements for each of your requirements on the old system, then set those as the benchmarks for the new one. You might be able to get some of the numbers from log analysis, if the current system captures enough data about time on task, error rates, and so on.”
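As a rough illustration of turning a benchmarking test on the old system into target values, the sketch below summarizes a handful of sessions. The session data is invented for the example, and the function name and the choice of the median as the summary are assumptions, not something the panel prescribes.

```python
from statistics import median


def benchmark_targets(sessions):
    """Summarize old-system sessions given as (seconds_on_task, completed) pairs."""
    if not sessions:
        raise ValueError("at least one session is needed")
    completed_times = [seconds for seconds, done in sessions if done]
    if not completed_times:
        raise ValueError("no participant completed the task")
    return {
        # Time on task is summarized over successful attempts only.
        "median_seconds": median(completed_times),
        "completion_rate": len(completed_times) / len(sessions),
    }


# Made-up measurements for one task on the current system.
old_system_sessions = [(210, True), (185, True), (340, False), (240, True), (198, True)]
targets = benchmark_targets(old_system_sessions)
print(targets)
```

Values like these become the floor for the new system: the requirement can then say "at least as good as the measured benchmark" instead of quoting a number nobody can justify.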
Mike asks, “Will the measurements and your inferences about the user population be reliable? Does 6 out of 10 mean 60% of the user population, or do you just have 10 particular users you care about? If you really mean the entire population of your users, be prepared to crunch the numbers for statistical significance and confidence intervals. It’s not enough to quote Jakob Nielsen saying x number of users can give reliable results if certain conditions are met. You still have to crunch your numbers to see whether that was true in your case.”
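To see why the numbers need crunching, here is a minimal sketch of a confidence interval for "six out of ten" participants succeeding. It uses the Wilson score interval, which is one common choice for small samples and is an assumption made for this example, as is the 95% confidence level.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a proportion; z=1.96 is roughly 95% confidence."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


low, high = wilson_interval(6, 10)
print(f"Observed 6 of 10; plausible population rate: {low:.0%} to {high:.0%}")
```

With only ten participants, the interval runs from roughly 31% to 83% of the user population, so a test in which six of ten succeed says little about whether 60% of all users would.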
Whitney adds, “The overall goals for the new version should also guide you. Even if the real motivation is to change from a mainframe platform, there are probably management goals for improvements. These will also give you clues about the kinds of things you should use to see whether the new system is living up to expectations.
“You also have to decide whether you are going to assess a range of activities that are representative of the entire system. Choosing specific activities makes the requirements definitive, but also presents the risk that the designers will design the system to optimize those tasks. So be careful to choose activities with interactions that are typical for a range of activities.”
Testing the Design
Whitney recommends, “Think carefully about how to construct the usability tests to make them realistic. Do users use the system while talking to someone on the phone, reading from forms customers have submitted, or collecting information from various sources? Instead of creating clearly defined tasks for test participants to complete, you could put together a collection of tasks that might occur in a typical day’s work and ask participants to complete the whole set of tasks as they would in the course of their normal work. Taking this approach would help you avoid stating tasks in terms of the system’s structure or interactions.
“Remember, when you test for compliance with your requirements, you are not looking for usability problems to fix, but testing overall performance. Of course, any information you collect about the source of usability problems will help you and your vendor remedy any failures, so there is value to collecting lists of problems participants encounter.”
Mary tells us, “The NIST effort has led to the creation of a new International Organization for Standardization (ISO) working group to identify and define a framework and consistent terminology for the specification and evaluation of the usability of an interactive system. This framework will include the following types of documentation, which provide the information that is required to apply human-centered design to the development of an interactive system:
- Context of Use Description
- User Needs Report
- User Requirements Specification
- User Interaction Specification
- User Interface Specification
- Evaluation Report
- Field Data Report
“An ISO Technical Report that defines the framework and its elements is currently under ballot and should be available in about a year. We hope these resources will be of benefit to organizations and can provide guidance in the difficult area of defining usability requirements.” The NIST Industry Usability Reporting Web site offers more information and working group documents.
Some Final Words of Advice
In summary, Whitney says, “The question you are trying to answer in writing these requirements is ‘What aspects of the system design will affect the user experience, and how can I make sure the new design will address them well?’ You’ll probably find this definition harder to write than the specific values for the benchmarks. Your usability requirements should describe the context of use: who, what, when, where, and why.
“The specific activities the requirements describe should reflect both a range of user goals that the system must support and business goals for creating the new system. Your measurements should be in direct comparison to the older system, or based on actually running the test tasks on the old system.”
Mike offers this advice: “My recommendation is to specify what kind of usability testing you’ll do and when in the development process you’ll do it. Avoid defining meaningless metrics. Let user data drive them. For more information on this topic, please refer to my article ‘Rigor in Usability Testing.’”