Book Review: Designing Voice User Interfaces

December 17, 2018

One of the most common questions I hear when talking with students, prospective students, colleagues, and clients is: what are the emerging trends in User Experience? Most of the time, people seem to be looking for specific product predictions: What is the next device that Apple, Amazon, or Google will roll out? Or maybe, what is the next big social-media trend?

While predicting specific disruptive innovations can be challenging—and forecasting their precise timing is even more difficult—it is safe to assume that the technology that mediates our interactions with information will become less obstructive and eventually recede into our environment. This may seem obvious to people who have read Adam Greenfield’s Everyware or Peter Morville’s Ambient Findability. Plus, we have been experiencing this shift for some time—longer than many realize.


While in the early days, UX design borrowed from graphic design and much UX design work still looks similar to traditional graphic-design deliverables, the future will be much less visual. I tend to think this shift is a good thing. Too often, people conflate pleasant visual design with UX design, saying, “User experience is a type of Web design.” Or: “The selection of colors and type is UX design.” In a future when UX design looks much less like visual design, we will be able to focus more on what User Experience is really about: designing tools and processes that work for people.

Over the last few years, it has been interesting to observe how Siri, Cortana, Alexa, Bixby, and Google have become part of our lives. But what I see missing from today’s curriculum for UX education is a focus on alternative interface design. We have a fairly well-established language for describing interactions with visual interfaces, as well as for information architecture, but standards for speech, virtual reality (VR), and augmented reality (AR) user interfaces still seem to be lacking.

It was the realization that voice is here to stay that prompted me to begin reading Cathy Pearl’s book, Designing Voice User Interfaces: Principles of Conversational Experiences.

Book Specifications

Title: Designing Voice User Interfaces

Author: Cathy Pearl

Formats: Paperback, Kindle, ebook

Publisher: O’Reilly Media, Inc.

Published: 2016, 1st edition

Pages: 278

ISBN-10: 1491955414

ISBN-13: 978-1491955413

Historical Context

With all the attention speech recognition and voice user interfaces (VUIs) have received recently, it’s important to note that the technology has actually been around for quite a while. We’ve all interacted with interactive voice response (IVR) systems on our phones. These systems were the precursors to Siri, Alexa, and today’s other voice interfaces. Often, the same companies developed the underlying technology for both.

Although we can certainly point out the deficiencies in today’s voice-recognition appliances, Pearl notes that similar foibles of phone systems have served as fodder for comedy writers in the past.

Digital Assistants, Personality, and Avatars

While Pearl’s book describes voice interactions, the real core of the voice interface is conversation. A speech-recognition system is simply a user interface that mediates the conversation. Thus, there has been considerable discussion of how an assistive user interface should work with the user.

When designing a voice user interface, you must predict users’ questions and design the answers to them. This is where the art and knowledge of how to craft an interview, something with which UX professionals are familiar, come into play. Thinking of spoken phrasings or words as a user interface draws on our skills for developing well-scoped phrases that prompt the desired action from users.

In a voice user interface—or, more accurately, a conversation-based interface—understanding timing, call, and response is even more critical than in a typical visual user interface. The development of a workflow that really demonstrates an understanding of a user’s task is critical because there is often no visual reference.

When an information system presents itself as an assistant or some other entity that actively responds to our queries, it is natural for humans to assign a personality or even a gender to that system. Voice user interfaces encourage this phenomenon.


One thing that impressed me when the Amazon Echo was introduced was the feedback I had seen from people with diminished motor skills. The Echo’s voice user interface—when paired with a smart-home hub—could give a quadriplegic person the ability to control their home environment simply by speaking. Pearl’s book includes accessibility concepts and best practices and provides relevant examples.

Don’t Forget About the GUI

While voice user interfaces often get the most attention from consumers, most if not all of our modern user interfaces are multimodal. Pearl advocates for a holistic approach to designing the experience. It’s important to design and test the graphical user interface (GUI) in concert with the voice user interface. Frequently, the graphical user interface delivers the output for a query. For example, if the user asks Siri for directions to a particular landmark, the directions appear in Apple Maps.

Survey of Voice User Interfaces

Designing Voice User Interfaces provides a great overview of the components of which you should be aware when designing voice user interfaces. One key step in learning about new technologies or systems is simply learning the vocabulary to use in describing the key components of that technology. The book describes the voice technologies that are currently available, starting with an overview of automatic speech recognition (ASR) systems. But technology is just one part of the VUI puzzle.

It is also essential to understand the conventions for different types of interactions. While modern smartphones introduced swipes, pinches/zooms, and other gestures, concepts such as barge-in, end-of-speech, and no-speech timeouts are at play in voice systems. Knowing what they are and how and when to apply them is critical to delivering successful experiences.

Precise understanding of users is always vital, but perhaps even more so for voice interfaces. Pearl provides an example: When designing a voice interface that asks the user to provide an account number, the designer must have the insight and empathy to realize the user might not have that information readily available. Therefore, you must consider timeouts and provide alternative paths to identify the user.
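To make the account-number scenario concrete, here is a minimal Python sketch of how such a dialog might handle a no-speech timeout and fall back to another way of identifying the caller. It is not taken from Pearl’s book. The `Listener` callback, the prompt wording, the timeout values, and the phone-number fallback are all assumptions made for illustration.

```python
from typing import Callable, Optional, Tuple

# Hypothetical listener: takes the prompt text and a no-speech timeout in
# seconds, and returns the recognized utterance, or None if the caller
# said nothing before the timeout expired.
Listener = Callable[[str, float], Optional[str]]


def digits_only(utterance: Optional[str]) -> Optional[str]:
    """Return the utterance with spaces removed if it is all digits, else None."""
    if utterance is None:
        return None
    compact = utterance.replace(" ", "")
    return compact if compact.isdigit() else None


def identify_caller(listen: Listener) -> Tuple[str, str]:
    """Return (method, value) describing how the caller was identified."""
    # First attempt: ask for the account number.
    reply = listen("What is your account number?", 10.0)
    if reply is None:
        # No-speech timeout: the caller may be looking for the number.
        # Reprompt once with more time and mention the alternative path.
        reply = listen("Take your time. You can also say 'I don't have it'.", 20.0)

    account = digits_only(reply)
    if account is not None:
        return ("account_number", account)

    # Alternative path (assumed): identify the caller by phone number instead.
    phone = digits_only(listen("No problem. What phone number is on the account?", 10.0))
    if phone is not None:
        return ("phone_number", phone)

    # Last resort: hand the call to a person rather than looping forever.
    return ("agent_handoff", "")
```

In a real system, the listener would wrap a speech recognizer. Here, any function with the same shape works, which also makes the flow easy to exercise with scripted replies during early testing.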

Cognitive Load

It is vital to realize that voice user interfaces, whether in the form of IVRs or modern assistants, can overwhelm people’s working memory. How often have you stopped paying attention to an IVR when it offered more than three options? Pearl provides best practices and recommendations on how to design navigation systems so users do not become frustrated by the interface.
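One simple way to act on this is to cap the number of options spoken in any single prompt. The Python sketch below shows the idea. The function name, the limit of three, and the prompt wording are my own assumptions, not recommendations quoted from the book.

```python
from typing import List


def build_menu_prompts(options: List[str], max_per_prompt: int = 3) -> List[str]:
    """Split a long menu into spoken prompts of at most max_per_prompt options."""
    prompts = []
    for start in range(0, len(options), max_per_prompt):
        chunk = options[start:start + max_per_prompt]
        has_more = start + max_per_prompt < len(options)

        # Join the options the way a person would say them aloud.
        if len(chunk) <= 2:
            spoken = " or ".join(chunk)
        else:
            spoken = ", ".join(chunk[:-1]) + ", or " + chunk[-1]

        prompt = f"You can say {spoken}."
        if has_more:
            # Offer the remaining options only on request.
            prompt += " For other options, say 'more'."
        prompts.append(prompt)
    return prompts
```

For a five-item menu, this produces two short prompts instead of one long list, so the user has to hold at most three choices in mind at a time.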


I’ve noticed people’s tendency to assume that a new technology means well-established principles no longer apply. I saw this when the iPhone was introduced and again when the iPad launched. One of the things I appreciate most about Pearl’s book is that it clearly demonstrates that, while the underlying technologies for our tools may vary, the actual process for delivering good user experiences does not. The skills and processes that we use as UX professionals are independent of technology or fads.

One fine example of Pearl’s advice is to ensure that you test voice user interfaces with users. As I read her description of how to do this, I was reminded that paper prototyping and think-aloud protocols are still applicable—even in our newest technology products.

Designing Voice User Interfaces provides a good introduction to designing voice interactions. Plus, I definitely learned a lot about the history of VUIs, the technologies that support them, and the design tenets for these interfaces. If you are already familiar with UX design and research skills and competencies, this book should provide some easily transferable knowledge. 

Owner and Principal Consultant at Covalent Studio LLC

Akron, Ohio, USA

D. Ben Woods

Ben’s global design and technology firm specializes in software design and development for the Web, mobile, and ecommerce. The company serves clients ranging from small startups to some of the largest companies in the world, including General Electric, Rio Tinto, and Fidelity. His career in User Experience began in the late 1990s. Ben has held diverse roles, including UX management at a global B2B firm, full-time and part-time academia, and executive roles. He enjoys solving complex business problems and coaching talent to be competitive UX design professionals. Ben earned his MS in Information Architecture and Knowledge Management at Kent State University and is a graduate of the Executive MBA program at Case Western Reserve University’s Weatherhead School of Management. He has presented long-format talks, speed presentations, and posters at many conferences and events and has conducted training and workshops for organizations throughout the United States, Europe, and Asia.
