Many brands have explored voice user interfaces (VUIs) by creating some kind of skill for a smart speaker. While that’s a good start, it’s not the optimal way of leveraging voice capabilities for the future.
Instead, you should add voice to your existing apps to complement their current touch user interface. Early adopters such as Spotify have taken this approach and have received widespread praise for their voice functionality. Spotify users can keep on using the app just as they’ve always used it, with its traditional touch user interface. Plus, they can use voice commands to control the media player. This is how you should use voice.
Rather than creating VUIs as replacements for your applications’ current user interfaces—as for Google Home or Alexa—create voice capabilities that provide a complementary user-interaction modality for your current user interface.
Of course, this route is a bit harder for you to take because it requires actual development skills and a well-thought-out user-interface design. Nevertheless, it’s the only way brands can take full advantage of voice and have full control over the user experience. In this article, I’ll discuss six reasons why adding voice to your existing app is the best way to leverage voice.
1. Owning the User Experience
While existing platforms such as Alexa or Google Home, in theory, give brands access to an existing user base, skill discovery is still a big issue on most smart-speaker platforms. It doesn’t matter whether your skill could be useful to millions of people if no one can actually find out about it.
Plus, even if you could build your brand experience on top of a smart-speaker platform, would you really want your audience to begin every interaction with that experience by saying “Hey Google” or “Hey Alexa”? Or would you rather keep your users in your own channels for the whole duration of the experience?
Another big issue with smart-speaker platforms is their ever-changing rules. If you build a great skill and people start using it, you might simply be helping the tech giants who own these platforms to collect the data your skill generates, then use it to build an even better, easier-to-use skill of their own. You can’t know before investing in designing and building it. The only way to really own the user experience is to build it yourself.
2. Learning from Your Customers
One great benefit of using voice is that it lets you learn about what your users want your app to do—beyond what they can already do with it.
Let’s say you have a mobile app. Using any analytics tool, you can collect data about which of your app’s features your users use and what buttons they click. But what about the features they’re looking for, but can’t find? There’s no data about those.
If you’ve created a voice user interface, you can collect data about what new features your users want. If your users ask for things your app doesn’t support, you can get great insights regarding both what features they want and the language they use when looking for them. This is one of the best benefits of voice capabilities that app designers rarely discuss. But, if you were to build your app on a smart-speaker platform, the provider of that platform would own this data.
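To make this concrete, here is a minimal sketch of the idea: route recognized speech to known intents and record anything that doesn’t match for later analysis. The intent names, trigger patterns, and the in-memory log are all assumptions for illustration, not a real analytics API.

```javascript
// Known features, each with trigger patterns that map speech to an intent.
// Intent names and patterns are hypothetical examples.
const knownIntents = [
  { name: "play_music", triggers: [/\bplay\b/i] },
  { name: "skip_track", triggers: [/\b(skip|next)\b/i] },
];

// Utterances that matched no feature: exactly the data this section
// argues is so valuable, captured in the user's own words.
const unmatchedLog = [];

function handleUtterance(utterance) {
  const intent = knownIntents.find((i) =>
    i.triggers.some((re) => re.test(utterance))
  );
  if (intent) return intent.name;
  // No feature matched: log what the user actually asked for.
  unmatchedLog.push(utterance);
  return null;
}
```

In a real app, the unmatched log would feed your analytics pipeline, giving product teams a prioritized list of requested features, phrased in the users’ own vocabulary.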
3. Making Your App Easier to Use
Many common user-interface tasks are tedious and require a lot of clicking or tapping. Something as simple as filling out an online form can require dozens of taps. What if you were to support voice input that understands what your users want to do?
For example, if a user of your travel-booking app could say “Show me flights from New York to Boston departing next Friday,” would this be easier than having to select a From field, type New Y, and pick New York from an autocomplete list; then select a To field, type Bost, and pick Boston from another list; then, finally, move on to a separate interaction to select the date?
While some brands have experimented with speech-to-text solutions that complement their Web forms, plain speech-to-text is not enough. A good voice user interface must also have natural-language-understanding capabilities, and cross-browser support for most natural-language solutions is still poor.
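The difference between raw speech-to-text and natural-language understanding can be sketched in a few lines. The toy parser below handles only one utterance shape with a regular expression; a production voice UI would use a real natural-language service, and the field names here are assumptions for the example.

```javascript
// Toy natural-language sketch: extract origin, destination, and date
// from one utterance pattern. Real NLU handles many phrasings; this
// illustrates the principle of one sentence filling several fields.
function parseFlightQuery(utterance) {
  const m = utterance.match(/flights? from (.+?) to (.+?) departing (.+)/i);
  if (!m) return null; // not a flight-search utterance
  return { origin: m[1], destination: m[2], date: m[3] };
}
```

One spoken sentence fills the same three form fields that previously took a dozen taps and two autocomplete lists.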
4. Saving Precious Screen Real Estate
One issue with current touch-screen user interfaces is that users can click only the buttons that appear on the screen. Therefore, users must wade through nested menus to find the features they rarely need.
This is not the case when using a voice user interface. Users can access all of the capabilities of your app from every screen. Imagine that you have an ecommerce app. What if a user has already gotten halfway through the checkout process, then remembers he hasn’t added AA batteries to his cart? Using a traditional user interface, the user would have to go back to the store, find the batteries, then start the checkout process all over again. In the worst case, the user might abandon the purchase and complete the transaction at another store.
With a voice user interface, the user could just say, “Add AA batteries to my cart,” then just continue the checkout process. Plus, there would be no need to go back and forth between typing and selecting on a small touch screen.
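A minimal sketch of how this could work: a single command registry that every screen shares, so “Add AA batteries to my cart” is handled mid-checkout without any navigation. The cart object and phrase pattern are assumptions for illustration.

```javascript
// Hypothetical global voice-command registry shared by all screens.
const cart = [];

const globalCommands = [
  {
    // Matches e.g. "Add AA batteries to my cart"
    pattern: /add (.+?) to my cart/i,
    run: (m) => cart.push(m[1]),
  },
];

function dispatchVoiceCommand(utterance) {
  for (const cmd of globalCommands) {
    const m = utterance.match(cmd.pattern);
    if (m) {
      cmd.run(m);
      return true; // handled without leaving the current screen
    }
  }
  return false; // fall back to the touch UI
}
```

Because the registry is global rather than tied to any one screen, the checkout flow never has to be interrupted, which is the whole point of this section.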
5. Supporting Your Users’ Names for Things
A touch-screen user interface requires a button for each feature. Your users obviously cannot select a category that is not available on a menu. Therefore, you would need to decide for them whether they would find T-shirts under T-shirts, Tees, Turtlenecks, or Shirts.
With voice user interfaces, it’s not necessary for users to guess what terminology you’re using for what they want. For example, regardless of whether users think about changing their profile picture as “Change my image” or “Set up my profile,” they can get to that feature by saying either command. This makes your users immediately feel familiar with your user interface and helps them find what they’re looking for.
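This many-phrasings-to-one-feature mapping can be sketched as a simple lookup. The feature names and phrase lists below are illustrative assumptions; a production system would use fuzzier matching than exact strings.

```javascript
// Several user phrasings map to one feature, so users need not
// guess the app's own terminology. All names here are examples.
const featurePhrases = {
  change_profile_picture: [
    "change my image",
    "set up my profile",
    "update my photo",
  ],
};

function resolveFeature(utterance) {
  const spoken = utterance.trim().toLowerCase();
  for (const [feature, phrases] of Object.entries(featurePhrases)) {
    if (phrases.includes(spoken)) return feature;
  }
  return null; // no known feature for this phrasing
}
```

Adding a new synonym users actually say is a one-line change to the phrase list, whereas a touch UI would force you to pick a single menu label and hope users guess it.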
6. Reducing Users’ Need to Type
People can speak at a rate of about 130 words per minute. However, most people can type only about 50 words per minute on a desktop keyboard and only about 35 words per minute on a mobile device. Therefore, typical users can be roughly three to four times more productive when they can speak commands or dictate text using voice recognition instead of typing.
Even though automatic speech recognition is not perfect, it’s significantly faster than typing in most cases. Plus, the error rate for speech may be lower than that for typing on a touch-screen device.
In general, you should think of voice as a new mode of interaction for your current touch user interface—not as something totally new that would revolutionize the user experience to which your users are already accustomed. Think of voice not as a user interface, but as a modality.
While most of today’s hype around voice user interfaces centers on smart speakers and voice assistants, creating separate touch-only and voice-only apps makes a unified user experience impossible. To create an optimal user experience, you should start thinking about adding voice capabilities to your current mobile app.
Ottomatias has 15 years of professional experience, working with fast-growing startups and communications agencies. Over the past few years, he has worked for companies such as Beddit, which Apple acquired, and Leadfeeder, which is among the fastest-growing SaaS (Software as a Service) companies in Europe. He also served as a Special Advisor for the Minister. He’s currently enabling developers to build awesome user experiences with voice and artificial-intelligence (AI) capabilities at Speechly, a leading voice startup.