Even though computers are controlling more and more of the world, they are not always getting smarter. Oh, they’re becoming more sophisticated, but humans must make computer code smart, and we don’t always get things right. It doesn’t help that we’re using old, ad hoc methods of planning, design, and analysis.
It’s scary that we sometimes don’t know why artificial intelligence (AI) systems work. But we should be even more worried that pretty much every system we use—every app, every device—is now so complex that we cannot possibly predict all system behaviors.
Analyzing for Failure
When analysis gets done, the most typical method is still the use case. You come up with a set of preconditions, then determine the behavior of a system based on them. The problem is: complex systems have a huge number of use cases.
I once was in a meeting about a banking app, during which we kept bringing up more options for use cases, adding them to a list of use cases that we needed to write. I realized something about the system’s complexity and quickly created a matrix for some of the basic options on the whiteboard. When I did the math, I discovered that a complete use-case analysis would take about 300 billion years.
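As a rough sketch of that kind of whiteboard math: the number of distinct use cases is the product of the number of choices in each dimension of the system. The dimensions, their counts, and the effort estimate below are hypothetical placeholders, not the figures from that meeting.

```python
from math import prod

# Hypothetical dimensions of a banking app and how many options each has.
# These numbers are illustrative only.
OPTION_COUNTS = {
    "account_type": 5,
    "transaction_type": 10,
    "device_platform": 4,
    "network_state": 3,
    "user_role": 4,
    "locale": 20,
}


def count_use_cases(option_counts):
    """Every combination of options is a separate use case."""
    return prod(option_counts.values())


def years_to_write(use_cases, hours_per_case=1.0, hours_per_year=2000.0):
    """Effort to document every use case, under an assumed working year."""
    return use_cases * hours_per_case / hours_per_year


total = count_use_cases(OPTION_COUNTS)
print(total, "use cases,", years_to_write(total), "person-years")
```

Each added dimension multiplies the total rather than adding to it, which is why a list of use cases that keeps growing in a meeting is a warning sign.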
The system you’re designing probably isn’t any less complex than that one, so if you want a robust analysis, use-case analysis won’t work. Many other methods, such as Failure Mode and Effects Analysis (FMEA), are either too vague or have their roots in machine-era analysis. Such methods also take a long time, and the timelines software-development projects now require won’t accommodate them. The analysis that is necessary for aircraft and bridge design can take years, but despite all that analysis, things still fall apart occasionally.
I wish I had a simple solution to this problem and could give you a checklist for solving it, but I don’t. Working with very smart people, I have not yet come up with a much better system of analysis. Instead, I just try to keep the principles of resilient design in mind and share my ideas and the lessons I learn with everyone.
If you want to help prevent a potential robot apocalypse, start by keeping in mind that a badly designed system can easily be stupid, rude, dangerous, or even deadly. Now, let’s look at some examples of badly designed systems and consider how design could have better addressed their problems.
All of us encounter stupid, inhuman systems and experiences every day. When systems perform hard tasks or machines are obviously not very smart, people don’t complain too much. But when machines fail at simple things or are supposedly very intelligent, but act in ways that are weird, bad, or inhuman, people find that annoying and don’t trust the products.
How can you identify inhuman design? Just think how weird it would be if a person did the same thing that a system’s user interface does. What if a receptionist asked for your email address, then asked again just in case? What if you decided not to buy a pair of pants, but were physically barred from leaving the store until you filled out a survey?
Just the other day, a woman in London noticed that the McDonald's order kiosk lets customers remove every part of the burger, so she tried doing just that. She removed the bun and even the meat, then successfully paid for a McDonald's bag and a receipt, demonstrating this sort of system failure.
People who know I’m a UX designer often mention how good or how awful product suggestions sometimes are. Amazon is a good case study for this. The company is pretty good at most things they do, but that just makes any gaps in the system worse.
A few years ago, I bought a label maker after a significant amount of research. Naturally, I bought it through Amazon because they carry everything. But as soon as I put it in my cart and for months afterward, Amazon tried to sell me more label makers. Forget how unlikely it was that I’d want to own two or more label makers. These machines have consumables—the tapes that you use with them—and Amazon was suggesting label makers from other brands that took different tapes. At the same time, they did not try to sell me the label tapes that work with my label maker. I actually wanted to buy a variety of tapes, which come in different colors and degrees of weatherproofing, but it is a bit difficult for someone who is new to using these devices to find the right ones for their machine. Some help would have been nice.
You can see how Amazon got to where they are. They started as a bookstore, and those underpinnings leak out sometimes. Books don’t have accessories, but are sometimes parts of series, so readers would clearly want more books in a series—or more books on the same topic that other people liked.
Legacy systems are not inherently bad, but you cannot assume they’re working fine or as expected without examination. You need to watch how your system is working today, then make sure the basic data practices still apply and would work well for all cases and users.
I spend a lot of time writing parsing rules—for example, so old databases can display all caps the way humans write, instead of yelling. But, sometimes, I get a client to see the light and rewrite the database to support modern methods, better indexing, and full UTF-8, so a system seamlessly supports multiple languages.
Amazon now sells enough different types of products that they should be building generalized systems and trying to root out things such as all instances of referring to a title instead of a product. Then, they could not only support their customers better by suggesting the products and accessories they actually need, but could also better address their own organizational needs by making suggestions that drive even more sales, improve sell-through rates, and satisfy customers who then return.
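One way to picture such a generalized system is a catalog in which every item is a product that can declare what it works with. This is a minimal sketch under that assumption; the `Product` fields, the SKUs, and `suggest_after_purchase` are hypothetical names for illustration and say nothing about how Amazon's systems are actually built.

```python
from dataclasses import dataclass, field


@dataclass
class Product:
    sku: str
    name: str
    kind: str  # for example "device", "consumable", or "book"
    # SKUs of the devices this item is known to work with (hypothetical field).
    compatible_with: set = field(default_factory=set)


def suggest_after_purchase(purchased, catalog):
    """Suggest items that work with the purchase, not substitutes for it."""
    return [item for item in catalog if purchased.sku in item.compatible_with]


label_maker = Product("LM-1", "Label maker, brand A", "device")
catalog = [
    label_maker,
    Product("LM-2", "Label maker, brand B", "device"),
    Product("T-1", "Weatherproof tape for brand A", "consumable", {"LM-1"}),
    Product("T-2", "White tape for brand B", "consumable", {"LM-2"}),
]

for item in suggest_after_purchase(label_maker, catalog):
    print(item.name)
```

With the compatibility relationship stored as data, the same logic covers tapes for label makers, cartridges for printers, or later volumes in a book series.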
McDonald's could easily allow customers to remove any part of a hamburger, but have a rule requiring at least one item to be selected. This would add a bit more complexity, but should prevent someone’s ordering just ketchup, while providing a more useful, sensible system. Formal analysis would reveal such issues, as would simply thinking about the logic of ordering and delivery, taking into account the fact that humans need food.
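Such a rule can be small. The sketch below assumes the rule is read as "at least one food component must remain," which is one way to rule out a ketchup-only order. The component names and the split between food and condiments are hypothetical.

```python
# Hypothetical component list; which items count as food is an assumption.
FOOD_COMPONENTS = {"bun", "patty", "cheese", "lettuce", "tomato", "pickles", "onion"}


def validate_burger(remaining_components):
    """Return (ok, message) for a customized burger."""
    remaining = set(remaining_components)
    if not remaining & FOOD_COMPONENTS:
        return False, "Keep at least one ingredient, or remove this item from your order."
    return True, ""


print(validate_burger(["patty", "ketchup"]))
print(validate_burger(["ketchup"]))
print(validate_burger([]))
```

The message tells the customer what to do next instead of simply refusing the order.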
Everyone knows by now how rude error messages can be. Rude messages result when companies build systems from an engineering point of view, assuming that user interactions are the problem rather than the goal of most systems. Every day, we encounter terrible error messages as we try to use apps and Web sites. Messages yell at us for not typing our phone number with hyphens or insist that the name of our childhood pet is too short.
But what about error behaviors—when systems do things for us improperly or actively work against us?
In the early 2000s, long before Nest’s learning thermostat came out, I installed a connected, smart thermostat in my house. It actually worked pretty well, but one day, I came downstairs and, instead of displaying a temperature, it said LO. What does that mean? Well, I can guess that it means low, which in systems like this is shorthand for off-scale low.
When a reading is either off-scale low or high, it is simply outside the range that a system was designed to measure or which an instrument can measure. But my house was clearly not colder than the measurement range of a thermocouple, so again, my question was: what does LO mean? Something was clearly wrong, and worse than the message was the fact that the system was doing something about it that I hadn’t asked it to do.
After some investigation using my multimeter and talking to the manufacturer, we figured out something internal had broken, so the thermostat was occasionally getting no temperature reading. It read that bad data as very, very cold. But instead of just warning me, it was switching modes and turning on the heat at maximum power, in the middle of the summer.
The manufacturer of my smart thermostat thought this was an electronic error, but it’s also a design error. Who decided that the thermostat is smarter than the user? I am requesting cooling mode, so why should it switch to heat mode all by itself?
Failures occur in real life, so what about the case of spurious sensor data? Instead of freaking out about a rare, out-of-range measurement and taking a catastrophic action, the thermostat should realize that this sudden shift is impossible, disregard the bad data, and carry on doing its job properly.
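Here is a minimal sketch of that behavior: hold the last believable temperature when a reading is missing, off-scale, or an impossible jump, and raise a warning only if the bad readings persist. The class name, the limits in degrees Fahrenheit, and the three-strikes rule are all assumptions for illustration, not how any particular thermostat works.

```python
class TemperatureFilter:
    """Passes plausible readings through; otherwise holds the last good value."""

    def __init__(self, min_valid=-40.0, max_valid=140.0, max_jump=10.0, warn_after=3):
        self.min_valid = min_valid    # assumed lowest believable indoor reading
        self.max_valid = max_valid    # assumed highest believable indoor reading
        self.max_jump = max_jump      # assumed largest believable change between readings
        self.warn_after = warn_after  # consecutive bad readings before warning the user
        self.last_good = None
        self.rejected_in_a_row = 0

    def _plausible(self, reading):
        if reading is None:
            return False
        if not (self.min_valid <= reading <= self.max_valid):
            return False
        if self.last_good is not None and abs(reading - self.last_good) > self.max_jump:
            return False
        return True

    def update(self, reading):
        """Return the temperature to act on (None until a good reading arrives)."""
        if self._plausible(reading):
            self.last_good = reading
            self.rejected_in_a_row = 0
        else:
            self.rejected_in_a_row += 1
        return self.last_good

    @property
    def should_warn(self):
        """True when bad readings persist and a person should be told."""
        return self.rejected_in_a_row >= self.warn_after


sensor = TemperatureFilter()
for raw in [74.0, None, -200.0, 75.0]:
    print(raw, "->", sensor.update(raw), "warn:", sensor.should_warn)
```

Nothing in this filter changes the heating or cooling mode. That choice stays with the user, and the filter only decides which number is trustworthy enough to act on.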
Extending the principle of avoiding errors is how we can ultimately get to radical future products such as self-driving cars. While a few automakers are being more obvious about it, there are bits of self-driving capabilities in many new cars. They’re just sold as a bunch of individual safety features. Consider the overlay on the display from the back-up camera that says how far away obstructions are. That is a form of augmented reality that is so functional and seamless, we just use it instead of labeling it.
In some vehicles, automatic lanekeeping is brilliantly well done. There’s no shrill alarm. Instead the car simply steers for you—so subtly that you may not even notice it—and keeps you on the road.
Computers often work best when they stay in the background. The politest technical solution is usually the most automatic, subtle, and quietest solution.
My thermostat’s error was merely inconvenient, but bad systems are sometimes way beyond inconvenient, putting people at risk and costing them—or their insurance companies—money by destroying property.
In 1996, the European Space Agency launched their new Ariane 5 rocket. Well, it wasn’t entirely new. It was really an evolution of their previous rockets, but it was a major leap forward, lifting more load, much higher and faster. So it was a major disappointment when the first rocket they launched exploded 37 seconds after liftoff.
It turned out that the cause of the explosion was a bit of legacy code that they had reused for a simple part of the system, without properly testing it. The faster rocket exceeded the limits of the old software. Just as my thermostat decided that anything out-of-bounds cold was bad and turned on the heater, in a similar case, the rocket’s system initiated a self-destruct.
Last month, I was driving through the middle of nowhere in Illinois when my car started acting funny. I kept checking the instruments, but saw nothing troubling for a while. Then, suddenly, the oil light came on. That’s bad, so I slowed down, signaled, and pulled off the road. Then, the car died.
The engine killed itself due to a blown oil line, and the car would never run again. I donated it to a nearby NPR station. Further investigation of the failure revealed that the oil warning light on this car triggers at 8 psi, but normal operating pressure is more like 38 psi. Just 2 psi lower than normal pressure is bad. In other words, the warning light was designed to tell users the engine is already ruined, and there’s nothing you can do about it.
Design your systems to be smarter and use valid data whenever it’s available, instead of going into failure mode.
The Ariane 5 fell into the same trap Amazon did, assuming a legacy system would keep working forever. Even so, the code could have been designed to treat out-of-bounds conditions not as truly fatal errors but as exceptions, then handle those exceptions as well as it could. The rocket failed so badly that it overrode good systems reporting valid data.
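The published inquiry into the flight traced the failure to a 64-bit floating-point value that no longer fit when converted to a 16-bit signed integer. The sketch below is not the flight software, which was not written in Python. It only illustrates one way to treat an out-of-range value as a flagged exception rather than a fatal error, by clamping it and reporting that it was clamped. The function name and the choice to return 0 for a not-a-number input are assumptions.

```python
import math

INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_saturating(value):
    """Convert a float to the 16-bit signed range without ever failing.

    Returns (converted, in_range). When in_range is False, the caller knows
    the number was clamped and can fall back to other, valid data sources.
    """
    if math.isnan(value):
        return 0, False  # assumed fallback for a meaningless input
    clamped = max(INT16_MIN, min(INT16_MAX, value))
    return int(round(clamped)), clamped == value


for sample in [100.4, 40000.0, -1e9]:
    print(sample, "->", to_int16_saturating(sample))
```

The important part is the second return value. The system keeps running, and whatever consumes the number can decide how much to trust it.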
It is your job, as a UX designer, to make sure that all conditions—both normal and exceptional—consider the human factor. If no one else has clearly defined the useful limits of a system to protect users, their actions, and their property, you must make sure that happens.
Often, such solutions emerge quite naturally. Designing the part of a system that displays a low-oil warning, you get to pick the icon, color, and position based on standards. For any error—whether an annunciator light on a panel or a pop-up dialog box on a Web site—I always ask:
What is the actual condition the system is representing?
Does the user need to know about it?
What can the user do to respond to it, correcting the issue?
I have designed systems exactly like that low-oil warning light, and my human-centric questions led to better solutions. The highest-priority information appears when you have 30 seconds or less to safely shut down a system. Subtract the time it takes to notice the message and make sense of it, and call what remains 20 seconds. Ask: “At highway speeds, can the operator pull over the vehicle and shut it down within 20 seconds?” Probably not, so maybe the system should warn the driver more than 30 seconds in advance. While we’re at it, we should add a clear Stop Now label, translated to the local language, and an icon that’s a hand in a stop sign instead of just a general warning icon.
Design controls to operate only when using them would be effective and safe. Design data and warnings to display only when they’re important and relevant, and display them in time so the user can do something about the problem.
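The timing argument is simple arithmetic, sketched here. The 10-second allowance for noticing and understanding the message is inferred from the 30-second and 20-second figures in the text, and the 45 seconds to pull over and shut down at highway speed is a hypothetical estimate.

```python
def usable_response_time(warning_lead_s, perception_s):
    """Seconds left to act once the operator has understood the warning."""
    return max(0.0, warning_lead_s - perception_s)


def required_lead_time(action_s, perception_s, margin_s=0.0):
    """How far ahead of failure the warning must appear."""
    return action_s + perception_s + margin_s


# A 30-second warning leaves about 20 seconds to act.
print(usable_response_time(30.0, 10.0))

# If pulling over safely takes 45 seconds (an assumption), the warning
# has to appear at least this many seconds before the failure.
print(required_lead_time(45.0, 10.0))
```

Working backward from the action the user has to complete gives the trigger point for the warning, rather than picking a threshold first and hoping there is enough time.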
Having your car break down on the side of the road is very inconvenient—perhaps even dangerous. But there are design failures that actually risk life and limb.
In 2014, a Scaled Composites/Virgin Galactic SpaceShipTwo broke apart in flight, killing one test pilot and seriously injuring another, just because the copilot pulled a lever at the wrong time. The NTSB investigation found that no one had paid attention to human factors: in the training, in the design or function of the controls, in crew procedures, in ergonomics, or even in checklists for safety equipment.
It wasn’t the fault of the crew member for pulling the lever at the wrong time, but of a design failure that required many closely timed manual procedures and allowed the copilot to pull the lever during a phase of the flight when it was inherently dangerous to do so.
In March of 2016, an automated shuttle train that moves passengers between terminals at Denver International Airport (DIA) suddenly accelerated from its normal speed of 8 mph to 22 mph, then regained its composure and stopped. As a result of these sudden movements, the passengers were thrown around, sending four—including a child—to the hospital and injuring 27 others.
Digital systems are very safe from engineering failures. This wasn’t some random short. However, these systems are subject to human coding errors. Someone simply entered the wrong speed for the track segment during a maintenance cycle, and the train obeyed its commands.
Basic systems analysis reveals the limits of safe and suitable operation. Systems do not exist for their own sake, but to support the needs and goals of people—for example, to make lifesaving equipment operational—or to achieve a company’s mission.
SpaceShipTwo experienced many human factors and usability failures, but two stick out. First, overloading users by requiring them to perform many discrete tasks, in order, is always a recipe for mistakes. Requiring these tasks to occur at very specific times compounds the problem. This is relevant not just to cockpit design, but to systems design in general. To avoid requiring users to perform so many discrete tasks, try grouping and automating certain tasks.
Even if we assume SpaceShipTwo needed to have manual controls, a risk analysis should have included human factors. Scaled Composites considered mechanical failures, but never considered that a person could make a mistake. Lights and gauges indicated to the crew when it was time to pull the lever, so the system could also have had lockouts, enabling the control only when conditions were safe to use it.
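A lockout of that kind can reuse the same conditions that already drive the lights and gauges. The following is a minimal sketch. The `Interlock` class, the phase names, and the speed threshold are hypothetical and are not taken from the vehicle's actual design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interlock:
    """Enables a control only while stated conditions hold (illustrative)."""
    min_safe_speed: float
    allowed_phases: frozenset

    def lever_enabled(self, speed, phase):
        return phase in self.allowed_phases and speed >= self.min_safe_speed


def pull_lever(interlock, speed, phase):
    """Act on the lever only if the interlock allows it; otherwise explain why not."""
    if not interlock.lever_enabled(speed, phase):
        return "blocked: conditions are not yet safe for this control"
    return "unlocked"


# Hypothetical values for illustration only.
lock = Interlock(min_safe_speed=1.4, allowed_phases=frozenset({"boost"}))
print(pull_lever(lock, speed=0.8, phase="boost"))
print(pull_lever(lock, speed=1.5, phase="boost"))
```

The person still pulls the lever, but the system declines to act on it at a moment when doing so is known to be dangerous.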
The DIA shuttle carries people. That’s really the only job it performs, so very far up in the system constraints should be: do not kill or injure passengers. An automated train should never be able to accelerate or decelerate to speeds that would injure people, even in an emergency.
There is probably a guideline in the manual telling the maintenance crew not to use such settings, but implementing such constraints in the application would be better. The project team should have carefully added speed and acceleration boundaries, so the train or the Ariane 5 would ignore out-of-bounds data or spurious information due to failure, as I described for the thermostat.
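As a sketch of what such boundaries might look like in code, the example below rejects an out-of-range speed at the moment a maintainer enters it, and separately limits how quickly the commanded speed may change. The limit values, function names, and error type are hypothetical. Real limits would come from a safety analysis of what standing passengers can tolerate.

```python
# Hypothetical limits for illustration only.
HARD_LIMIT_MPH = 10.0  # assumed absolute maximum for any track segment
MAX_STEP_MPH = 2.0     # assumed largest speed change per control cycle


class SpeedCommandError(ValueError):
    """Raised when an entered speed is outside the safe envelope."""


def validate_segment_speed(entered_mph, hard_limit_mph=HARD_LIMIT_MPH):
    """Reject a bad maintenance entry before it is ever stored."""
    if not (0.0 <= entered_mph <= hard_limit_mph):
        raise SpeedCommandError(
            f"{entered_mph} mph is outside the allowed range 0 to {hard_limit_mph} mph"
        )
    return entered_mph


def next_speed(current_mph, target_mph,
               hard_limit_mph=HARD_LIMIT_MPH, max_step_mph=MAX_STEP_MPH):
    """Move toward the target, never past the hard limit and never too abruptly."""
    target = min(max(target_mph, 0.0), hard_limit_mph)
    delta = max(-max_step_mph, min(max_step_mph, target - current_mph))
    return current_mph + delta


print(next_speed(8.0, 22.0))  # a bad target is capped and approached gradually
print(next_speed(8.0, 0.0))   # stopping is gradual, too
```

The two checks are deliberately redundant. Even if a bad value slips past data entry, the control loop still refuses to produce a change that would throw passengers around.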
People Are the Intelligent Part of the System
Designing in intelligence is the role of UX designers. I keep reading articles about improving system user experiences and injecting humanity into our designs, which then go on to focus entirely on material design or confuse the lessons of some science-fiction interface with the meaning and function of a system.
We should design not just screens, but the broader user experiences of systems. UX designers must design data structures, methods of reducing the display of errors and minimizing delays, and behaviors that occur when there is a variable or no network connection. We must design solutions for cases when systems, components, and users do the unexpected. No one else is going to act as a representative for the users, focusing on meeting their goals and keeping them safe.
There’s a lot of conversation about artificial intelligence these days, which just feels like an extension of Big Data and the other algorithmic decision-making trends of the past few years. These technologies may indeed change the world, but the question is whether they’ll change it for better or worse.
The one true intelligence behind any system is the team building the product. Teams without human factors or UX professionals build inhuman, dangerous systems. As a UX designer, your choices define the sensibility and morality of a system. They establish boundaries of logic and safety. UX design provides any system’s humanity. We build systems so people can interact with them safely and effectively.
For his entire 15-year design career, Steven has been documenting design process. He started designing for mobile full time in 2007 when he joined Little Springs Design. Steven’s publications include Designing by Drawing: A Practical Guide to Creating Usable Interactive Design, the O’Reilly book Designing Mobile Interfaces, and an extensive Web site providing mobile design resources to support his book. Steven has led projects on security, account management, content distribution, and communications services for numerous products, in domains ranging from construction supplies to hospital record-keeping. His mobile work has included the design of browsers, ereaders, search, Near Field Communication (NFC), mobile banking, data communications, location services, and operating system overlays. Steven spent eight years with the US mobile operator Sprint and has also worked with AT&T, Qualcomm, Samsung, Skyfire, Bitstream, VivoTech, The Weather Channel, Bank Midwest, IGLTA, Lowe’s, and Hallmark Cards. He runs his own interactive design studio at 4ourth Mobile.