Usability Testing for Survey Research

By Emily Geisen and Jennifer Romano Bergstrom

October 9, 2017

Chapter 2: Respondent–Survey Interaction

Usability testing allows an in-depth evaluation of how respondents interact with surveys and how this interaction affects the quality of a survey. For example, a respondent may understand the survey question and response options, but may have difficulty selecting an answer accurately on a smartphone’s small screen.

To begin to understand how usability testing can be used to identify potential problems with surveys and improve the overall quality of data collected, we consider the different types of error that can occur in the survey process.


Sources of Potential Errors in Surveys

The two major categories of survey errors, as shown in Table 2.1, are errors of nonobservation and errors of observation (Groves, 1989). We will look at them one at a time.

Table 2.1—Sources of error in surveys
Errors of Nonobservation	Errors of Observation
Coverage	Interviewer
Sampling	Instrument
Nonresponse	Respondent
	Mode

Errors of Nonobservation

As the name suggests, errors of nonobservation occur when certain members of the target population are not included in the survey. These errors further group into coverage, sampling, and nonresponse errors. Coverage errors occur when members of the population of interest are not in the sampling frame—that is, the list of individuals, businesses, or households used to select the sample. Sampling errors occur because our survey estimates are produced from only a subset of the population of interest.

In usability testing, we are not concerned with coverage errors or sampling errors. We are concerned to some extent, though, with nonresponse errors.

Nonresponse errors occur when survey responders are systematically different from nonresponders on the key concepts the survey is measuring. For example, if a survey intends to measure customer satisfaction and only unhappy customers respond to the survey, the results will not reflect the opinions of all customers.
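
To make the customer-satisfaction example concrete, here is a minimal sketch in Python. The scores and the response rule are hypothetical, chosen only for illustration, and do not come from any study. The sketch compares the mean satisfaction of the whole population with the mean among those who respond, assuming that only unhappy customers respond.

```python
from statistics import mean

# Hypothetical satisfaction scores (1 = very unhappy, 5 = very happy)
# for a population of ten customers. Illustrative values only.
population = [1, 2, 2, 3, 4, 4, 4, 5, 5, 5]


def responder_mean(scores, responds):
    """Mean score among customers for whom responds(score) is True."""
    return mean([s for s in scores if responds(s)])


# Assumption: only customers scoring 2 or lower respond to the survey.
true_mean = mean(population)
observed_mean = responder_mean(population, lambda s: s <= 2)

# A negative value means the survey understates satisfaction.
nonresponse_bias = observed_mean - true_mean
```

Under these assumptions the responders look far less satisfied than the population as a whole. If responders and nonresponders had similar scores, the same calculation would give a bias near zero, which is why nonresponse matters only when the two groups differ on what the survey measures.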

When surveys are difficult to use, respondents may break off, which will lead to nonresponse errors if those who break off are different from those who complete the survey. Because usability testing focuses on respondents’ experiences interacting with the survey, we are interested in learning about issues that might cause people to quit the survey. However, break-offs are just one type of nonresponse error. Usability testing will not identify other reasons people have for not responding to a survey.

Errors of Observation

Although we strive to reduce break-offs, we are primarily concerned with reducing errors of observation, also known as measurement errors. These occur when the true value is different from the value reported by the respondent. For example, the question, “In the past 12 months, how many times have you seen a doctor?” may not provide us with an accurate account. First, a respondent may not be able to recall every time they saw a doctor. Second, the respondent’s understanding of seen a doctor may be different from the researcher’s. For example, do nurses or other types of healthcare professionals count? Does it count if the respondent spoke with the doctor on the phone?

Measurement errors in surveys can come from any of these sources: interviewer, respondent, instrument, and mode of administration (Biemer, 2010; Groves, 1989).

Interviewer errors—for interviewer-administered surveys—occur when respondents’ answers differ due to the ways that interviewers read and administer the survey. This may be for any number of reasons related to the interviewers’ performance or the interviewers themselves. For example, an interviewer might misread a question or record a response incorrectly. Respondents’ answers also can be affected by the interviewer’s manner and appearance, tone, feedback, and behavior. For example, young people may answer differently when the interviewer is close to their age than when the interviewer is older (Davis & Scott, 1995).
Instrument errors arise from a problem with the wording and ordering of the survey questions or the layout of the survey instrument. For example, small question-wording changes—such as asking about Obamacare instead of the Affordable Care Act—can affect how respondents answer questions: a 2013 CNBC poll found that when Obamacare was used in the question, 46% of respondents opposed the law, compared with 37% who opposed the Affordable Care Act. Similarly, visual design features, such as listing response options horizontally instead of vertically, can affect the distribution of responses to identically worded questions (Toepoel, Das, & van Soest, 2009).
Respondent errors occur when differences in respondents’ experiences, cognitive ability, and motivation affect responses. For example, some respondents may be better able than others to recall how many times they have been to a doctor. Some respondents may interpret doctor to include nurses and other healthcare professionals, and other respondents may not. Finally, some respondents may not want to report the true number of times they went to the doctor.
Mode-effects errors occur when the mode of the survey—for example, mail, telephone, Web—introduces differences in survey results. For example, telephone surveys often suffer from recency effects, where participants are more likely to recall the last response options read to them, while mail surveys often suffer from primacy effects, where participants are more likely to select the first response options (Krosnick, 1991). Mode error can also occur when respondents complete a survey on different devices. For example, Stapleton (2013) compared surveys administered on laptops and mobile devices. They found that when only some of the response options were visible on a mobile device without scrolling, respondents were significantly more likely to choose response options on the visible part of the screen.

When evaluating the usability of a survey, we are concerned with instrument, respondent, and mode-effects types of measurement errors and some types of interviewer errors. Usability testing can and should be conducted with interviewer-administered surveys, but in these situations, the interviewers are our users. Even though they are not formulating an answer to the survey question the way a respondent would, interviewers are the ones using and interacting with the survey to record respondents’ answers. Usability issues in the survey that affect some interviewers more than others could introduce interviewer errors.

Usability testing allows us to identify potential issues in the usability of the survey that may lead to measurement or nonresponse errors. The goal is then to reduce these errors through iterative usability testing. To do this, we evaluate how well respondents can use and interact with the survey instrument to provide their responses. We observe and evaluate what works well and what does not. Then the survey is revised and tested again to see if we have resolved the issues.

This error-reduction process hinges on our ability to identify potential sources of errors in surveys, then understand why these errors occurred, so we can revise our surveys to correct for them. We next explore how respondents answer and respond to survey questions, particularly self-administered survey questions.

The goal is to identify and reduce potential sources of error through iterative usability testing.

How Respondents Answer Survey Questions

To evaluate surveys well, we must examine both how respondents understand and answer survey questions and how they interact with surveys. Let us start with an overview of the response processes respondents use to answer survey questions.

The response formation model (Tourangeau, 1984; Tourangeau, Rips, & Rasinski, 2000; also see Willis, 2005, for more discussion) is a conceptual model that shows the four steps and associated cognitive processes that respondents follow when answering survey questions.

Comprehension:
- question focus—Determining the intent and meaning of the question and instructions.
- context—Assigning meaning to specific words used in a question.
Retrieval:
- strategy—Deciding on retrieval strategies such as episodic enumeration—that is, listing and counting individual events; estimating—for example, about twice a week; or inferring—for example, less often than I go to the store.
- attitude recall—Consulting memory for relevant information.
- factual recall—Recalling specific or general memories.
Judgment:
- compile—Compiling information retrieved to generate a relevant answer.
- motivation—Deciding how much cognitive effort to expend to retrieve an answer.
- sensitivity—Determining whether to tell the truth or present a more socially desirable answer.
Response:
- response selection—Selecting a response category that best represents the respondent-derived answer.

A respondent will not necessarily go through these steps in order and may not go through them all when answering a question. For certain questions, the necessary cognitive processes are automatic. The premise of the model, though, is that each specific process must be successful to prevent error. Therefore a breakdown in any one of these cognitive processes could introduce measurement error.

But a question can be problematic even if a respondent follows the four-step process. For example, Chepp and Gray (2014) acknowledge that how a respondent answers questions is also informed by their “social experiences and cultural contexts.” Researchers have observed differences in response style based on respondents’ cultural background (Harzing, 2006). Differences include acquiescence—the tendency to agree with survey questions—and extreme response—selecting the endpoints on the scale.

The more we understand how respondents answer survey questions, the better we will be at designing usable surveys that yield high-quality data and accurately address our research objectives. In fact, many textbooks and courses on survey-research methods rely on this model as the theoretical foundation for questionnaire-design principles—for example, Fowler, 2014; Groves et al., 2004; Tourangeau et al., 2000.

How Respondents Interact with Surveys

How respondents use and navigate a survey also affects the data collected from surveys. Figure 2.1 shows a series of slider questions that were included on a Web survey (Romano Bergstrom & Strohl, 2013). These questions were usability tested to evaluate how well people were able to complete the survey. Testing revealed a problem related to how participants used the slider to indicate their selected response: Respondents did not understand that they had to move the slider to indicate a response of Never. When a participant’s answer to a question was Never, he or she often left the slider in its starting position at the far left. However, to accurately reflect an answer of Never, the survey tool required that respondents actively move the slider to the middle of the Never box. If the user did not move the slider, no response was registered, and it was treated as missing data.

Figure 2.1—Respondents had to move the slider to respond Never.

Image source: Adapted from J.C. Romano Bergstrom & J. Strohl (2013). “Improving Government Web Sites and Surveys with Usability Testing: A Comparison of Methodologies.” In Proceedings from the Federal Committee on Statistical Methodology (FCSM) Conference, November 2013, Washington, DC.

This potential source of error in the survey was not with any of the four stages of cognitive processing described in Tourangeau’s (1984) response formation model. The participants did not have trouble mapping their internally generated answers to the response category Never. The reason they did not input their response correctly is that they did not correctly interpret the screen functionality. If the survey had been fielded without usability testing to identify this error, it would have been impossible to tell if respondents selected Never or simply did not respond.
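
The ambiguity can be shown with a minimal sketch. The function and the numeric code for Never below are hypothetical. They are not taken from the survey tool in the study, and they assume only the behavior described above: a value is registered only after the respondent moves the slider.

```python
from typing import Optional

NEVER = 0  # hypothetical code for the leftmost scale point, "Never"


def recorded_value(touched: bool, position: int) -> Optional[int]:
    """Value a slider tool like the one described would store.

    Assumption: the tool registers a value only after the respondent
    moves the slider, so an untouched slider is stored as missing.
    """
    return position if touched else None


# A respondent who means "Never" but leaves the slider at its start.
left_alone = recorded_value(touched=False, position=NEVER)

# A respondent who actively moves the slider into the Never box.
moved = recorded_value(touched=True, position=NEVER)
```

In this sketch, `left_alone` is stored as missing while `moved` is stored as Never, even though both respondents intended the same answer. From the data file alone, a missing value cannot be told apart from a skipped question.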

Groves et al. (2004) noted that, beyond the cognitive processes described in the response formation model, additional aspects of the question-and-answer process occur with self-administered surveys, such as navigating the survey and interpreting instructions about how to complete it. In addition, Web-based survey respondents and interviewers who use computer-assisted interviewing (CAI) must also understand various computer features and functions.

Hansen and Couper (2004) describe additional considerations for visual design and usability evaluation that come into play with surveys administered on computers. Users must:

“Attend to the information provided by the computer, and the feedback it provides in response to their actions in determining their own next actions.”

That is, an interaction occurs between the user and the survey. The user’s actions affect the survey, and the survey’s reactions affect the user. For Web-based surveys, interacting with the computer can also affect how respondents cognitively process survey questions.

So how should issues related to navigating and interacting with surveys be treated when evaluating potential errors? Groves et al. (2004) considered navigation concerns to be part of the comprehension process under the response formation model. Hansen and Couper (2004) address some of the additional processes users go through in their model for the self-administered CAI, shown in Figure 2.2.

Figure 2.2—Model of self-administered, computer-assisted interviewing

Image source: Reproduced with permission from S. E. Hansen & M. P. Couper (2004). “Usability Testing as a Means of Evaluating Computer-Assisted Survey Instruments.” In S. Presser, et al. (Eds.), Methods for Testing and Evaluating Survey Questionnaires. New York, NY: Wiley.

The self-administered, computer-assisted interviewing model starts with the computer screen displayed to the respondent.
1. The respondent must then interpret the screen, which includes interpreting the intent of the question and interpreting the actions that need to be taken on the screen.
2. The respondent then has a choice of seeking more information on the screen or going through the cognitive process of answering the question—for example, recall, judgment, response—both of which require action on the part of the respondent.
Once the respondent has generated an internal response to the question, he or she inputs the response.
The survey then determines whether the input is acceptable.
1. If the response is acceptable, the survey proceeds to the next question or action.
2. If the response is not acceptable, the survey provides feedback to the respondent, such as an error message that indicates that the response is not adequate.

Note that a respondent must also navigate between survey pages or within a given survey screen.
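
The input-and-feedback portion of this model can be sketched as a simple loop. This is an illustration under our own assumptions, not Hansen and Couper's implementation: the function names, the validation rule, and the error message are all hypothetical.

```python
from typing import Callable, Iterable, List, Tuple


def administer_question(
    attempts: Iterable[str],
    is_acceptable: Callable[[str], bool],
    error_message: str,
) -> Tuple[str, List[str]]:
    """Run one question through the input, check, and feedback loop.

    `attempts` stands in for the respondent's successive inputs.
    Returns the accepted response and the feedback shown along the way.
    """
    feedback_shown: List[str] = []
    for attempt in attempts:
        if is_acceptable(attempt):
            # Acceptable input: the survey proceeds to the next question.
            return attempt, feedback_shown
        # Unacceptable input: the survey gives feedback and waits again.
        feedback_shown.append(error_message)
    # The respondent stopped trying, which corresponds to a break-off.
    raise ValueError("no acceptable response was supplied")


def is_valid_visit_count(text: str) -> bool:
    """Hypothetical rule: a whole number of doctor visits, at most 365."""
    return text.isdigit() and int(text) <= 365


answer, feedback = administer_question(
    attempts=["a few", "3"],
    is_acceptable=is_valid_visit_count,
    error_message="Please enter a whole number.",
)
```

Here the first attempt is rejected and triggers one error message, and the second attempt is accepted. Usability testing looks at what happens inside this loop: whether respondents notice the feedback, understand it, and can act on it.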

Therefore, when thinking about how respondents answer self-administered surveys, just evaluating the cognitive processes of comprehension, retrieval, judgment, and response is not enough. We also need to evaluate the respondent–survey interaction, as this affects the final data we can obtain from Web surveys. That is, we want to assess how efficiently and effectively respondents can use the Web survey to accomplish their goals in a way that is pleasing to them. A breakdown in the usability of a survey can affect how respondents answer survey questions.

Just as a breakdown in one of the cognitive processes identified in the response formation model can affect the quality of data we receive, so can a breakdown in the usability of a survey.

Usability Model for Surveys

Building on the work presented by Hansen and Couper (2004), we propose a conceptual process model for the respondent–survey interaction involved in completing a survey. This interaction includes three key components: interpreting, completing, and processing feedback. Whereas the response formation model focuses on how respondents comprehend survey questions, the following Usability Model for Surveys focuses on how respondents use surveys:

Interpreting the design:
1. What meaning do respondents assign to visual design and layout?
2. How do respondents believe the survey works?
Completing actions and navigating:
1. How well does the survey support respondents’ ability to complete tasks and goals?
2. How well do respondents follow navigational cues and instructions?
Processing feedback:
1. How do respondents interpret and react to the survey feedback in response to their actions?
2. How well does the survey help respondents identify, interpret, and resolve errors?

In the Usability Model for Surveys, we highlight the three key usability processes that contribute to how accurate, effective, and satisfying the experience is for respondents. These processes are not mutually exclusive and often work in sequence with each other. As with the cognitive processes in the response formation model, a respondent must process each usability aspect successfully to prevent measurement error. If a respondent does not understand how the survey works, selects the wrong response, accidentally skips a question, or is unable to resolve an error, it will affect the quality of data received from the survey.

Listing these aspects separately from the cognitive processes emphasizes the need to identify potential breakdowns in survey usability, not just other sources of measurement error. Review and evaluate questionnaires with the explicit goal of preventing potential usability problems.

Response Formation Model—How respondents comprehend survey questions
Usability Model for Surveys—How respondents use surveys

The fact that usability testing should be an explicit pretesting goal does not imply that usability issues do not overlap with other potential quality concerns in questionnaires. You can try to focus on usability issues alone, but if participants tell you that they do not understand the question, do not ignore that finding. In fact, it is quite common to conduct combined usability and cognitive testing.

We propose the usability model for both self-administered and interviewer-administered surveys. With interviewer-administered surveys, we are still concerned with how the user, or interviewer, interacts with the survey, which is different from how the respondent uses a self-administered survey. Because interviewers are not answering the questions, they go through different cognitive processes than respondents. Yet, interviewers still interact with the survey in a way that could introduce errors.

Certainly, interviewers interact with a survey quite differently than respondents do because interviewers must also interact with respondents. Another reason that interviewers interact with surveys differently from respondents is that interviewers typically receive training on how to use a survey, and they usually complete the survey numerous times compared to respondents who typically use a survey once. A breakdown in the usability process for either a respondent or an interviewer can lead to measurement error.

In the following sections, we detail the three aspects that comprise the Usability Model for Surveys.

Interpreting the Design

When completing a survey, respondents must understand the intent of the survey question, or cognitive process, as well as the actions they must take, or usability process. Because comprehending how something works or behaves is different from comprehending language—for example, question wording—they should be evaluated separately.

What Meaning Do Respondents Assign to Visual Design and Layout?

When respondents complete self-administered surveys, they must comprehend more than just the words used in the survey question. Respondents also assign meaning to different visual designs (Christian, 2003; Christian, Parsons, & Dillman, 2009; Tourangeau, Couper, & Conrad, 2004). Christian, Dillman, and Smyth (2005) posit that this visual language is as important as question wording when creating survey questions.

Much of this visual-language processing is automatic and subconscious. Tourangeau, Couper, and Conrad (2004) identified five heuristics respondents use when interpreting visual design:

Middle means typical.
Left and top mean first.
Near means related.
Up means good.
Like means close.

During usability testing, we want to assess how respondents’ interpretation of the visual design and layout affects how they believe the survey works and how they actually interact with the survey.

For example, in completing Question Q1a in Figure 2.3, respondents must decide which radio button goes with which label before they can select the option that corresponds to their response. The spacing of the words and radio buttons in the cluttered design of Q1a makes this difficult, so a respondent could easily think the radio button to the right of an answer is the correct choice. In Question Q1b, a different visual design makes it easier for respondents to select the radio button that matches their choice.

Figure 2.3—Cluttered design of Q1a versus clear design of Q1b

During usability testing, we can assess how participants interpret the visual design and how this interpretation affects their interaction with the survey. For example, respondents can have difficulty when the visual design is cluttered by unnecessary images, redundant information, or artistic flourishes that interfere with the task and negatively affect how a respondent interacts with a survey, as shown in Figure 2.4.

Figure 2.4—Unnecessary elements negatively impact survey’s usability.

Interviewers, not just respondents, can be affected by the visual design and layout of survey questions on their computer screen. To help them administer a computer survey, the screen design can be tailored to include more or less information. For example, the screen may display both the question text to be read and the response box, or it can also display the preceding and subsequent question. Edwards, Schneider, and Brick (2008) found that telephone interviewers more easily administered the survey when multipart questions were presented on a single screen with all parts visible to the interviewer, rather than on separate screens with only one subpart visible at a time. Use of the single screen resulted in fewer errors—for example, interviewer hesitation at participant confusion, wording changes, or disfluent delivery—than multiple screens. In addition, interviewers preferred the single screen and reported less confusion.

Observe how a survey’s visual design can affect the way that interviewers interact with and administer a survey to respondents. For some surveys, presenting too much information on one screen may be burdensome for interviewers. On another survey, it may be helpful to provide the necessary context needed when administering a string of related questions.

How Do Respondents Believe the Survey Works?

When respondents first interact with a survey, they have a mental model of how that survey should work, based on their experiences. This is particularly true when people interact with Web-based surveys, as their mental model maps onto similar interactions in their environment, such as other Web-based surveys, paper surveys, Web sites, mobile devices, and computers in general.

Survey respondents rely on design cues such as radio buttons or check boxes to determine how a survey functions. For many questions, the respondent’s process of assessing how a survey works will be almost automatic. For example, Figure 2.5 shows two identically worded survey questions. The question on the left has radio buttons, and the question on the right has check boxes. Most Web-savvy survey respondents recognize that radio buttons allow only one response, while check boxes allow multiple responses.

Figure 2.5—Question 4 uses radio buttons; Question 5, check boxes.

However, if respondents do not know that check boxes allow multiple selections, they may answer the question differently than someone who knows this convention. Instead of selecting all races that apply, respondents may select the category they identify with the most. Or they may select Some other race, interpreting that to include people of multiple races.

One of Dillman, Tortora, and Bowker’s (1999) principles for constructing Web surveys is that a “respondent-friendly design must take into account both the logic of how computers operate and the logic of how people expect questionnaires to operate.” One key purpose of usability testing is to evaluate whether the survey adequately supports a respondent’s assessment and understanding of how the survey should function—that is, the mental model. In the previous example, error can be introduced if the survey allows for multiple responses to be checked, but survey respondents do not understand that.

The more we can align survey design with respondents’ behaviors, the easier the survey will be for respondents to use. A challenge arises as we incorporate more technological capabilities into surveys—for example, buttons, links, images, videos, GPS. Respondents may not have specific mental models for these capabilities within a survey context, which makes user-centered design more challenging. In these instances, iterative usability testing is particularly advantageous as it allows for an initial design to be evaluated, tweaked, and retested to ensure that people understand how to use the survey.

Completing Actions and Navigating

How Well Does the Survey Support Respondents’ Ability to Complete Tasks and Goals?

Web-based surveys have numerous features and capabilities that are not available on paper surveys. But with this increased flexibility often comes increased complexity. A basic Web-based survey allows respondents to indicate their response by, for example, selecting a radio button or typing into a text box. With more complex surveys, though, respondents may need to interact with the survey in additional ways to provide their answer to a question. For example, they may need to access a definition, see the previous question for context, play a video, enter multiple pieces of information, or indicate that they do not know an answer. Increased complexity can be particularly problematic in surveys where most respondents are one-time users. They do not have the benefit of repeated use or visits to improve their learning.

Despite the technological advances offered with Web-based surveys, they must be simple and easy to use. For example, several experimental studies have shown that the greater the level of effort required to obtain a definition on a Web-based survey, the less likely respondents are to read the definition. They are more likely to request definitions when only one mouse click is required, compared to two or more clicks (Conrad, Couper, Tourangeau, & Peytchev, 2006; Experiment 1). Respondents are even more likely to view definitions when they only need to roll over the term with the mouse cursor instead of clicking once (Conrad et al., 2006; Experiment 2). And, when definitions are always visible on screen, respondents are even more likely to read them, compared to when they have to roll over the term (Galesic, Tourangeau, Couper, & Conrad, 2008; Peytchev, Conrad, Couper, & Tourangeau, 2010).

In the Peytchev et al. (2010) and Galesic et al. (2008) studies, responses to the survey questions differed for respondents who accessed the definitions compared to those who did not. Although the studies did not measure accuracy, per se, the differences in response distribution suggest that reading the definitions affected how respondents answered and interpreted the question.

These studies demonstrate that computer-centric, interactive features like rollover definitions can affect how respondents answer survey questions. Making surveys easier for respondents to use can reduce satisficing and improve data quality.

Satisficing in Surveys

Satisficing occurs when respondents are not willing or able to provide the effort—for example, mouse clicks or movements, or mental calculations—to produce optimal answers to survey questions (Krosnick, 1991). The theory behind satisficing is that, because of the burdens associated with everyday life, people tend to use the smallest amount of effort necessary to satisfy a requirement (Simon, 1957). An example of weak satisficing is when a respondent selects the first reasonable response without reading through all responses to ensure it is the best response. Strong satisficing would be straightlining, when a respondent provides the same answer to all survey questions without actually considering the survey questions.
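
Straightlining leaves a visible trace in the data, so it is one form of satisficing that can be flagged after the fact. The following is a minimal sketch with a hypothetical function name, threshold, and data. It treats identical answers across a set of items as a warning sign only, because the data alone cannot show whether the respondent actually considered the questions.

```python
from typing import Sequence


def is_straightlining(answers: Sequence[int], min_items: int = 3) -> bool:
    """Flag a set of items where every answer has the same value.

    Assumption: identical answers across at least `min_items` items are
    treated as a warning sign, not proof, of strong satisficing.
    """
    return len(answers) >= min_items and len(set(answers)) == 1


# Hypothetical answers to a five-item grid on a 1-to-5 scale.
same_every_time = [3, 3, 3, 3, 3]
varied = [3, 4, 2, 3, 5]

flag_same = is_straightlining(same_every_time)
flag_varied = is_straightlining(varied)
```

A check like this only detects the pattern. Usability testing addresses the cause, by finding the parts of the survey that make optimal answering harder than it needs to be.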

Krosnick (1991) notes that satisficing is related to three key factors: task difficulty, respondent ability, and respondent motivation.

In usability testing, we assess aspects of the survey that may be unnecessarily difficult, which increase the respondent’s cognitive burden and reduce accuracy. For example, if we notice that survey participants made errors because they were not reading instructions, we should reduce the amount of text by replacing some of it with visual cues. On Web-based surveys, we can use autofills instead of making respondents remember their answer to a previous question.

By reducing complexity, we can increase motivation and improve the quality of responses obtained in surveys.

How Well Do Respondents Follow Navigational Cues and Instructions?

A number of design features affect how participants navigate surveys. These include determining what questions to answer, placement of instructions and introductions, placement of Next and Previous buttons, and navigation menus.

To answer survey questions, respondents have to get to the questions and correctly navigate from one question to the next. Geisen et al. (2013) found that usability participants testing a paper survey were immediately drawn to the first question and skipped over instructions or introductions. Participants just wanted to get started, and question numbers essentially served as signposts to navigate through the instrument.

Usability studies have shown that survey respondents often do not read instructions before attempting to complete the survey.

And whether the survey is on paper or Web, this finding seems to hold. We have demonstrated in countless usability studies that survey respondents do not read instructions—unless they have to. Romano and Chen (2011) found that participants did not read the instructions on the right side of a survey log-in screen, as shown in the fixation gaze plot in Figure 2.6. Instead, they immediately looked for the actionable parts—that is, where they needed to enter their user name and password—to begin the survey. This was problematic because, during the sessions, participants asked why the survey was being conducted and why personal information was being requested—both of which were explained on the log-in screen.

Figure 2.6—Fixations show participants didn’t read instructions.

Image source: Reproduced with permission from J.C. Romano & J.M. Chen (2011). “A Usability and Eye-tracking Evaluation of Four Versions of the Online National Survey for College Graduates (NSCG): Iteration 2.” Statistical Research Division (Study Series SSM2011-01). U.S. Census Bureau.

In another example, respondents looked back and forth between the two input options, trying to figure out how the sections worked together, as shown in the fixation gaze plot in Figure 2.7. They did not read the instructions above the questions, which explained how to use the sections.

Respondents are often completely reliant on the visual layout and design when deciding how to navigate through a survey. Therefore, even more attention must be paid to the design of the survey and how it affects the way that respondents will navigate through it.

To illustrate this, imagine a series of survey questions presented on a screen in a two-column format, as shown in Figure 2.8A. Respondents rely on visual cues to determine how to navigate survey instruments, but a two-column format does not make the intended order of the questions clear. As a result, a respondent might take any of several pathways:

Answer all questions on the first row, then go down to the second row, and so on, as shown in Figure 2.8B.
Answer all questions in the first column, then answer those in the second column, as shown in Figure 2.8C.
Miss the second column altogether and only answer questions in the first column, as shown in Figure 2.8D.

Figure 2.8—Respondents relied on visual cues to navigate survey

Questions in a single column, as in Figure 2.9, are often easier for respondents to navigate correctly. However, items that typically go together such as state and ZIP code can still appear side by side.

Figure 2.9—A single-column layout facilitates navigation.

With paper surveys, usability issues are often related to navigational cues and instructions for skip logic. Many design considerations have been shown to improve the usability of navigating skip logic on paper surveys, including prominent question numbering and the use of multiple visual design elements to emphasize skip patterns (Dillman, Smyth, & Christian, 2009).

On Web-based surveys, navigational cues and instructions can be problematic too. For example, in a usability study for a Web-based diary (Gareau, Ashenfelter, & Malakoff, 2013), participants were confused about the functionality of the Save and Submit button, shown in Figure 2.10. “Some participants clicked this button after every row of data entry, others clicked it after completing each section, and others clicked it intermittently as it occurred to them.” Some participants expected to move to the next tab in the diary when they clicked it, and most participants were not sure how to submit their data when they were finished.

Figure 2.10—Use of Save and Submit button unclear to participants.

Image source: Reproduced with permission from M. Gareau, K. Ashenfelter, & L. Malakoff (2013). Full Report for Round 1 of Usability Testing for the Consumer Expenditure Web Diary Survey. Center for Survey Measurement, Research and Methodology Directorate (Study Series 2013-24). U.S. Census Bureau.

With Web-based surveys, most skip logic is automatic: the survey program shows respondents only the questions that they are required to answer. But respondents can still face other navigational challenges. For example, considerable research has been conducted on a basic navigational feature: the placement of the Next and Previous buttons. Some researchers argue that because the Next button is used more frequently, it should be to the left of the Previous button so that it is seen first and will be easier for users to reach (Couper, Baker, & Mechling, 2011; Dillman et al., 2009).

However, in a recent usability and eye-tracking study, Romano Bergstrom, Lakhe, and Erdman (2016) found that participants preferred the Next button to the right of the Previous button, and participants rated their survey experience as more satisfactory when Next was on the right. Eye-tracking data in this study showed that placing the Previous button on the right instead resulted in more fixations. This suggests that respondents were not expecting the Previous button to be in that location, and it took them longer to process the navigation. This is consistent with Couper et al. (2011), who found that the Previous button was clicked more when it was on the right. Since most respondents rarely use the Previous button to back up, more clicks on it are associated with worse usability.

These studies reveal that respondents expect to navigate through a Web-based survey the same way they navigate through other Web products—such as email, Web sites, and browsers—and everyday items like remote controls and phones. Respondents generally expect that right means Next and left means Back because this is consistent with general Web navigation features as well as everyday items, as shown in Figure 2.11.

Figure 2.11—Navigating through Web surveys just like other Web sites

In another usability and eye-tracking study, Bristol, Romano Bergstrom, and Link (2014) found that respondents easily understood the navigation on the mobile versions of a survey, but they had difficulty following the navigation on the desktop version, shown at the bottom of Figure 2.12. On the mobile versions, the Next button was at the bottom of the screen. This matched users’ mental model of moving down the screen as they answered questions, then continuing down the screen to click Next. On the desktop version, the Next button was located in the upper right of the screen. Respondents needed to move down the screen to answer questions, then go back up to the top of the screen to click Next to advance. In the study, people looked around the screen and pressed Go numerous times, verbalizing that the survey was not working properly, before realizing the Next button was at the top of the screen. The Go button was toward the bottom of the screen, after the response, and this placement matched their mental model of where the forward navigation button should have been.

Figure 2.12—Respondents had difficulty finding Next on the desktop.

Image source: Reproduced with permission from K. Bristol, J. Romano Bergstrom, & M. Link (2014). “Eye Tracking the User Experience of a Smartphone and Web Data Collection Tool.” Paper presented at the AAPOR conference, Anaheim, CA, May 2014.

To the extent possible, we should design navigation features to match respondents’ expectations of how these features should work.

Even when navigational features do match respondents’ expectations, we must evaluate the potential effect navigation can have on other aspects of the survey. For example, Cook, Sembajwe, and Geisen (2011) tested a mobile survey that mimicked the finger-swiping motion commonly used to navigate through Web pages on a mobile phone. Participants could use their finger in a swiping motion on the phone screen to navigate between survey questions.

Participants were familiar with the swiping function and enjoyed swiping between questions. Yet, when participants swiped to navigate, they occasionally changed their response to a survey question inadvertently, particularly on check boxes. When they began the swipe on a question response, as shown at the left in Figure 2.13, rather than on whitespace, as shown at the right, the initial placement of their finger registered as a selection and changed the response. Some participants did not even realize that they had changed their response to the survey question, leading to inaccurate responses.

Figure 2.13—Swiping sometimes inadvertently changed a response.

Processing Feedback

How Do Respondents Interpret and React to the Survey Feedback in Response to Their Actions?

With interviewer-administered surveys, the interviewer provides feedback to the respondent. If an invalid response is provided, the interviewer will likely repeat the question or response options in order to get an accurate response. With self-administered Web-based surveys, the computer or mobile device provides feedback to the respondent. This is one of the main distinctions between assessing usability of a paper survey and a Web-based survey.

When respondents complete a Web-based survey, they expect the survey to react to any action they take. For example, if they enter an answer and hit Next, they expect to be taken to a new page. If respondents click a button or link, they expect something to happen in response. Respondents must then interpret that reaction to decide on their next action. If they are taken to a new question, they will answer that question. If, instead, they remain on the same page after clicking Next, they must decide what needs to be done in order to move to the next screen.

In a well-designed survey, the feedback provided by the survey can help prevent errors. For example, it is common for respondents to accidentally miss a row in a grid question where several questions are grouped together in a table format. Several studies have found that providing dynamic feedback to survey respondents by graying out completed grid rows reduces item missing rates compared with traditional grids (Couper, Tourangeau, Conrad, & Zhang, 2013; Galesic, Tourangeau, Couper, & Conrad, 2007; Kaczmirek, 2011).
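The row-level feedback described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name grid_row_states, the row labels, and the answers are hypothetical and are not taken from the cited studies. The sketch labels each grid row as completed (a candidate for graying out) or pending, so that a skipped row stands out.

```python
from typing import Dict, List, Optional


def grid_row_states(rows: List[str], answers: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Label each grid row 'completed' or 'pending' based on whether it has an answer."""
    states = {}
    for row in rows:
        answer = answers.get(row)
        # A row counts as completed only if a non-empty answer was recorded.
        states[row] = "completed" if answer not in (None, "") else "pending"
    return states


# Hypothetical grid items and responses; "Speed" was skipped by the respondent.
rows = ["Ease of use", "Speed", "Reliability"]
answers = {"Ease of use": "Agree", "Reliability": "Disagree"}

states = grid_row_states(rows, answers)
pending_rows = [row for row, state in states.items() if state == "pending"]
print(pending_rows)
```

A survey page could gray out the completed rows and leave the pending rows at full contrast, which is the kind of dynamic feedback the studies above evaluated.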

However, respondents may misinterpret the feedback provided by the survey. Figure 2.14 displays a survey in which respondents had to answer a question multiple times for different departments and programs. Due to the repetitive nature of the items and the limited screen space, rollover definitions were used instead of including definitions in the survey questions. During usability testing, participants rolled their mouse over a definition or clicked it. But when the survey did not provide immediate feedback to the action, several participants moved the mouse away, assuming that a definition was not available. It turned out that the developer had programmed a 1-second delay into the rollover definitions to prevent them from coming up when they were not wanted. So respondents did not immediately realize that certain terms had associated definitions.

Figure 2.14—Respondents thought there were no definitions.

As a result, several participants in the study did not realize that the survey included rollover definitions. Their mental models were correct in that moving the mouse over the term would produce a definition. However, the delayed feedback provided by the survey changed their interpretation. By the next round of testing, the delay for the rollover definitions had been removed; participants successfully noticed, used, and relied on the definitions.

How Well Does the Survey Help Respondents Identify, Interpret, and Resolve Errors?

Ideally, surveys have been designed in ways that prevent errors. We can do this by designing effective Web-survey questions (Couper, 2008), providing informative feedback to respondents, and preventing common Web errors by adhering to usability guidelines, such as those proposed by Quesenbery (2001):

Make it hard for respondents to perform actions that are
- incorrect
- invalid or
- irreversible

Plan for respondents to do the unexpected.

Despite our best designs, however, respondents may still encounter errors. When observing errors in usability tests, it is helpful to examine not only the cause of the error, but also whether and how the respondent was able to recover from the error. Although errors due to poor design can be prevented to some extent, errors such as typos and accidentally skipping a question are not as easy to prevent but must be easily resolvable for respondents. A usable survey is one that is error-tolerant, meaning that the survey is primarily designed to prevent, and help respondents recover from, errors.

In an interviewer-administered survey, if a respondent provides an invalid answer, the interviewer can alert the respondent that the answer was invalid. The interviewer will then try to gain an appropriate answer from the respondent by repeating the survey question or the response options, or by prompting the respondent, “So would that be a yes or no?”

With a self-administered Web-based survey, the first step in providing feedback to respondents is to let them know that an error has occurred. For example, a respondent may inadvertently type the age as 422 instead of 42. If the survey does not notify the respondent that 422 is an invalid age, it is unlikely the respondent will realize the error.

Once an error is identified, the survey should adequately describe the error so the respondent can identify it. To help respondents interpret error messages, the messages should be positive, helpful, and near the problematic item.
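As a rough sketch of these two steps, detecting the error and then describing it helpfully, consider the age example. The code below is an assumption-laden illustration, not a method from the studies cited here: the FieldError type, the validate_age function, the message wording, and the plausible range of 0 to 120 are all hypothetical. Each error carries the identifier of the item it belongs to, so the message can be displayed near the problematic item.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldError:
    field_id: str  # the item the message should be displayed next to
    message: str   # what went wrong and how to fix it


def validate_age(raw: str, min_age: int = 0, max_age: int = 120) -> Optional[FieldError]:
    """Return None if the age is plausible; otherwise return a specific, helpful error.

    The 0-120 range is an assumed plausibility check, not a rule from the source.
    """
    text = raw.strip()
    if not text:
        return FieldError("age", "Please enter your age in years.")
    if not (text.isascii() and text.isdigit()):
        return FieldError("age", "Please enter your age using numbers only, for example 42.")
    age = int(text)
    if age < min_age or age > max_age:
        return FieldError(
            "age",
            f"You entered {age}. Please enter an age between {min_age} and {max_age}.",
        )
    return None


error = validate_age("422")
if error is not None:
    print(f"[{error.field_id}] {error.message}")
```

Echoing the entered value back (422) helps the respondent spot the typo, and the neutral wording keeps the message positive rather than blaming.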

The left side of Figure 2.15 shows a typical error message from a mobile-phone survey. The pop-up box notifies the respondent of an error, but it does not help the respondent determine what caused it. The error message is general, does not indicate which field is missing, and is not located next to the missed item. The respondent may not know which fields are required. In this usability study, participants saw the error message and had to search the entire screen to find the missing field and correct it, as shown in the eye-tracking gaze plot at the right of Figure 2.15.

Figure 2.15—Error message doesn’t indicate what field is blank.

Image source: Reproduced with permission from J. He, C. Siu, J. Strohl, & B. Chaparro (2014). “Mobile.” In R. Bergstrom & A. Schall (Eds.), Eye Tracking in User Experience Design. San Francisco, CA: Morgan Kaufmann.
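In contrast to the generic message in Figure 2.15, a survey can name each blank required field. The following Python sketch shows one way this might look; the function name missing_field_messages, the field identifiers, and the labels are hypothetical and do not come from the study. It returns one message per blank required field, keyed by field identifier so that each message can be placed next to its item.

```python
from typing import Dict


def missing_field_messages(responses: Dict[str, str], required: Dict[str, str]) -> Dict[str, str]:
    """Map each blank required field to a message that names it by its visible label."""
    messages = {}
    for field_id, label in required.items():
        # Treat absent and whitespace-only values as blank.
        if not responses.get(field_id, "").strip():
            messages[field_id] = f"{label} is required. Please enter it to continue."
    return messages


# Hypothetical form: field identifier -> label shown to the respondent.
required = {"first_name": "First name", "email": "Email address", "zip_code": "ZIP code"}
responses = {"first_name": "Ana", "email": "   "}

messages = missing_field_messages(responses, required)
for field_id, message in messages.items():
    print(f"[{field_id}] {message}")
```

Because every message names its field, the respondent does not have to scan the whole screen to work out what is missing.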
