Playing Well with Others: Design Principles for Social Augmented Experiences
Published: March 8, 2010
As the recent launches of Google Goggles (see Figure 1), Bing Maps (see Figure 2), Junaio, and the Unifeye SDK have demonstrated, technical barriers to delivering augmented reality (AR) experiences on a broad scale are falling rapidly. Separate advances in technologies for practical and commercial-scale, cloud-based speech and language processing; real-time search; computer vision; accurate geolocation and device awareness; AR commerce and development platforms; as well as high-bandwidth, sensor-enhanced mobile devices are coming together to form a first-generation infrastructure for augmented reality.
Figure 1—Google Goggles
Figure 2—Bing Maps augmented reality demo
With the exotic, mixed realities that futurists and science-fiction writers have envisioned seemingly just around the corner, it is time to move beyond questions of technical feasibility to consider the value and impact of turning the realities of everyday social settings and experiences inside out. As with all new technologies that move from the stage of technical probe to social probe, this AR transformation will happen case by case and context by context, involving many factors beyond the direct reach of UX design. However, as a result of the inherently social nature of augmented reality, we can be sure the value and impact of many augmented experiences depend in large part on how effectively they integrate the social dimensions of real-world settings, in real time.
Psychologists describe people’s ability to engage in and adapt to social situations and interactions well as social maturity or social intelligence, distinguishing social maturity from other aspects of maturity such as age, intelligence, and emotional acuity. I believe we can use the idea of social maturity to help frame and design the coming wave of social augmented experiences. In an earlier column on UXmatters, “Inside Out: Interaction Design for Augmented Reality,” I said that the majority of then-current AR experiences relied on a small set of interaction patterns, limiting their possibilities. I also suggested that missing patterns or antipatterns—such as Loner, Tunnel Vision, and Secondhand Smoke—lower the potential value and relevance of AR experiences. In combination, these AR interaction patterns and antipatterns can sometimes result in antisocial experiences and behavior within social settings. Or, using our borrowed metric, AR experiences often show fairly low levels of social maturity or intelligence.
Since broad-scale augmented reality is still in its pioneer stage, I think it’s more constructive to say that the social maturity of current augmented experiences is similar to that of a young child who is learning the complex rules and norms that determine socially acceptable behavior. With unevenly developed abilities and understanding, fitting into social situations is very difficult. Augmented reality is experiencing similar challenges. This is no slight. The social aspects of many more mature media are still neither fully evolved nor understood. However, it does mean that there is much ground to cover for augmented reality to reach social maturity. Looking ahead, there are two possibilities for the future of augmented reality:
- Augmented reality could mature socially and develop a proper understanding of social dynamics, enabling the creation of AR experiences that can successfully integrate with the social sphere—thus, providing truly social augmented experiences—and accelerating the growth and relevance of augmented reality.
- Without substantial social integration, augmented reality might remain restricted to a class of specialized utilities that are better suited for focused, asocial or semisocial activities like technical reference—one of the primary applications of augmented reality from the beginning.
When we look back on this period in the evolution of augmented reality, perhaps we’ll perceive the question of social maturity as an inflection point in the overall development of AR experiences.
One example of a newly feasible AR capability for mobile devices that has substantial social dimensions is facial recognition: using computer vision to identify the people in a camera’s field of view, then presenting information about them. Recently, Google launched Google Goggles, which uses the company’s very large product, image, location, and price databases to identify items like books and CDs, with its latent facial-recognition capabilities deliberately disabled. At the moment, Google believes it is exhibiting greater social maturity by publicly acknowledging the newly possible interactions, but not supporting them.
On the other hand, Comverse’s Face.com and TAT’s Recognizr application, shown in Figure 3, are wholly based on facial recognition and allow the quick retrieval of linked information such as social-network profiles, status updates, and shared content. While both Face.com and Recognizr are opt-in services and offer the ability to manage both your participation and your image’s associated profile information, their value premise is dependent on the very same ability Google has chosen to disable.
Figure 3—TAT’s Recognizr
We can understand the business reasons for both decisions, but simply turning a capability on or off fails to address the contextual layers and nuances of social and emotional settings. Nor does it present the opportunity to learn from and refine the interactions augmented reality allows us to design. What, though, does socially mature mean for facial recognition?
Let’s consider some of the AR facial-recognition experiences coming to market, calling attention to the interactions, relationships, and contextual models we must consider when designing a new technology we’re about to introduce into complex, living-and-breathing social settings.
The core usage scenario for current AR facial-recognition product offerings goes something like this: A user, holding a smartphone or other device camera, focuses on a person’s face, in close view, long enough to register and recognize the person’s facial image, then waits while the service interrogates that person’s digital identity profile—which he or she has possibly defined, but may be ad-hoc and aggregated. The user then lowers the phone and visually scans the profile or other information the facial-recognition application has retrieved, perhaps choosing to focus on a specific pool of Facebook or another service’s shared-lifestream content, perusing it to get a sense of a person’s status, interests, and usual activities. Finally, the user may decide to engage in conversation with that person—or may perhaps choose not to do so.
The contextual model for such an interaction assumes that interrupted or delayed conversations are socially acceptable—though asynchrony is one of the most frustrating aspects of poorly mediated interactions. (‘Can you hear me now…?’) This scenario also depends on the directed, gaze-based, device-powered scanning of faces at close, personal distances. For strangers who have no affiliation, I’d wager this experience is too similar to having a policeman stop you and ask you for your identification for most people to consider it an acceptable social interaction. Even among people with existing, but weaker ties like the indirect relationships that loosely link colleagues in a large organization, this feels like a depersonalizing and inherently suspicious way of greeting someone. Using a device in this way also creates a physical barrier between people and shifts the focus of one’s attention away from the person to the device and the information it presents. This hardly feels like an improvement over the everyday act of introducing yourself or joining a conversation group at a party.
Contrast this AR experience with the interaction models and user experience of the facial-recognition capabilities of desktop photo-management tools like iPhoto or Web-based services like Picasa. By default, these applications are much further removed from live social contexts, relying on asynchronous augmentation that neither disrupts the flow of social interactions nor invokes the same level of social meaning. They offer similar capabilities, but their social impact is much lower.
To be fair, it’s true that AR facial recognition and profile data retrieval could feel appropriate or even enhance the value of a complete experience in many social settings—for example, at a mixer event or when playing a pervasive game, attending or making a public presentation, or entering a shared workspace as a new member for the first time. But how can we know when and how to augment social experiences?
Design Principles for Social Augmented Experiences
Shifting our focus from critique to creation, I’ll suggest some guiding principles for the design of AR experiences that engage with the social dimension in a mature fashion and would increase the value of turning reality inside out. User experience is their reference point, so these principles are neutral with regard to technology, device, or medium. These principles are by no means universal across all cultures and contexts, but their common focus is social interaction and utility. Applied judiciously, they can inform design efforts at all stages of progress and perhaps contribute to the evolving social maturity of AR experiences.
Default to the Human
Augmented reality and its bigger brother ubicomp, or everyware, make many new types of interactions and behaviors possible. The vast majority of these possibilities for augmented reality, however, simply would not be relevant to the way people socialize and interact, and some would be strange, unpleasant, or even harmful in certain contexts. When envisioning social augmented experiences, we should design interactions, behaviors, and situations that follow human norms and expectations by default.
Why? By their nature, people are well designed for social interaction: Our neuro-linguistic, cognitive, and sensory systems do a very good job of supporting synchronous, human-to-human interactions on both small and large scales. We already have amazing capabilities for face, voice, and name recognition; recognizing body-language cues; simultaneous verbal communication; and inferring people’s emotional states. It is possible to make human-to-human interaction demonstrably better by designing AR experiences that add new social modalities using computerized versions of our existing social abilities, but doing so will require great delicacy. Sociality is one of the most fundamental indicators of humanness we use to understand ourselves and the world around us. Augmented reality has not yet taken us to the realm of the transhuman or posthuman. However, embedding nonhuman interactions in our social fabric would necessarily blur and stretch the markers for our common humanity.
For example, social practices for greeting friends and acquaintances vary tremendously across cultures and groups. But nowhere do people begin casual conversations by recounting every single one of a person’s actions or recent public statements verbatim. This is exactly the behavior that real-time search, facial recognition, lifestreaming, sensor-enhanced environments, and speech processing could facilitate. Similarly, administrative activities like exhaustively logging the locations, participants, and durations of all our face-to-face conversations for later analysis are not typical behavior in ordinary social interactions.
Social norms could change to incorporate some of the new possibilities AR technology affords. For example, people are already checking in regularly with location-based services like Foursquare while in the midst of a group of friends. But we are still far from the point at which using an AR application to name all the people within your field of view, and to rank them in order of their algorithmically calculated influence within a set of aggregated social graphs, will seem like a natural activity when you enter a room.
Enhancement, Not Replacement
Augmented reality should enhance real-world social interactions and situations rather than directly replacing them. Think of the classic sight gag in Airplane II, shown in Figure 4. One character is operating an elaborate, wall-sized video communications console to contact his commander at what seems like a far-away location. However, in the middle of their conversation, the commander opens and walks through the video console, revealing it to be a simple door with an ordinary window. They’ve been talking through the window as though it were a live video link.
Figure 4—Replacement is not enhancement
Build Real Bridges
Socially situated technologies succeed when they enable people to overcome real barriers to interactions and relationships. The explosive global growth of mobile phones in the past 20 years is a good example of the value inherent in bridging the barriers of distance—or time or infrastructure constraints. Augmented interactions can likewise succeed by bridging gaps between people and extending their social reach, as long as the augmented elements themselves are relevant and valuable. We don’t want to build bridges to nowhere like that shown in Figure 5.
Figure 5—Bridge to nowhere in New Zealand
Potentially valid scenarios for building bridges through augmentation abound. For example, two obvious cases are real-time, speech-to-text conversion for the hearing impaired and real-time language translation for travelers, for which menu translation and voice-call transcription offerings from Google could be precursors. But a bridge experience is valid only when direct social interaction is impossible for some reason, as when people find each other in crowds using their mobile phones, then hang up as soon as they see one another.
Stay Off the Critical Path
Augmentations should be optional for the social interactions and settings they aim to enhance. Introducing an indispensable augmentation into a social interaction could potentially make the augmentation the single point of failure for the entire interaction. Apparently simple interactions like exchanging business cards are often finely nuanced social rituals with many layers of meaning. Think of Patrick Bateman and his colleagues scrutinizing the designs of their minutely different cards. AR designers should note that dozens of technologies and products have tried and failed to augment the exchange of business cards over decades. (Do you Poken? I have two of the devices, but given Poken’s low adoption rate, all I can do is Poken with myself.)
Plaxo, LinkedIn, and many other online services manage people’s contact information digitally. But the design of these services complements the essential social interaction of exchanging business cards person-to-person, in the real world.
Until augmented reality is a stable and nearly universal social utility—think of the ubiquity of electricity or electronic funds transfers—it cannot be on the critical path for social interactions. Much simpler, more mature technologies like video conferencing have not yet reached the utility stage, making our social experiences with them notably uneven and people’s judgments of their overall value often low.
Simple Tools for a Complicated World
A host of factors can affect even the most straightforward of social interactions and settings, including personality, mood, history, memory, and the different agendas of the people who are present. When defining the interactions and elements of augmented social experiences, remember that simple designs can successfully integrate with the complexity of people’s decisions and behavior, without directly managing them. On many mobile devices, the convenient buttons that activate the silent-ringing mode and mute conversations—like the mute button Figure 6 shows—provide simple solutions that effectively address the complexities of managing presence, attention, and disruption in social settings and contexts.
Figure 6—The ever-elegant mute button
Avoid the Uncanny Valley
In 1970, roboticist Masahiro Mori identified the “Uncanny Valley,” when he noticed that people interacting with robots that look and act like human beings respond with increasing empathy as the robots become more humanlike. However, when robots reach the point at which they seem very like real humans, though are still identifiably nonhuman, people’s empathic responses to them drop sharply, and people become repulsed by the robots. The name of this phenomenon echoes the shape of the data graph that is depicted in Figure 7.
Figure 7—Masahiro Mori’s “Uncanny Valley”
The exact causes of the Uncanny Valley effect are unknown, but possible explanations include people’s avoiding infection or recognizing genetic abnormalities when choosing a mate. Recent research with monkeys has shown that they have the same pattern of responses, so this effect is common across a broad range of senses for at least two primate species. Very soon, it will be possible to create augmented experiences that incorporate realistic, but still ersatz human faces, voices, and movement that would invoke the Uncanny Valley effect.
An Investment Is a Trade Gone Bad
Securities traders use this expression to make it clear that all parties to a deal must receive something of real value for an exchange to be successful. For traders, this means they must receive something they can use as the basis of another trade rather than something they must hold—as an investment—to gain some uncertain, future benefit.
Likewise, for designers, this means all of the interactions and elements of augmented social experiences must be valuable to all of the people engaging with them. Otherwise, people will perceive the effort and costs of the augmentation as overhead or a burden of some sort. Further, the augmented elements must provide value within the context of a particular interaction rather than only within other contexts or for other purposes.
Are We On-Air?
To paraphrase William Gibson, augmented reality is here, but it is certainly not evenly distributed. People do not yet expect ordinary social interactions and experiences to be mixed realities that include significant augmented elements. Until consumers take mixed reality for granted as the norm, designers must always indicate the presence and status of augmented elements in social AR experiences. Like the On-Air signs in broadcast studios and the sinister red camera eye, shown in Figure 8, that ominously personifies the ubiquitous gaze of HAL 9000 in Stanley Kubrick’s 2001: A Space Odyssey, we must disclose the presence of augmentation in social experiences. The hostile reaction of New York commuters to the candid photos that n_train_gossip posted on Twitter shows how little people appreciate augmented interactions like surreptitious surveillance and the involuntary broadcasting of their already public activities.
Figure 8—HAL 9000
Context Is King
This simple guideline could trump all other AR design principles. Design AR experiences that follow the established norms for behavior and interaction in the social experience you are augmenting. In true inside-out fashion, this might mean that it is appropriate to take cues from those at the very bottom of the Uncanny Valley, as the characters in the zombie comedy Shaun of the Dead realize when they must find a way to temporarily blend in with a large crowd of shambling undead to reach shelter, as shown in Figure 9. Keen-eyed students of behavior and design would note how the group carefully rehearsed their zombie impersonations, coaching one another to properly emulate the inchoate moans, semirandom shuffling, and stilted, slow-motion urgency that are typical of zombies, all to avoid the fatal consequences of bungling their unheimliche, or uncanny, first impression.
Figure 9—Shaun of the Dead heroes, pretending they’re dead
Undiscovered Country Ahead
This starting set of design principles for social augmented experiences is necessarily limited and leaves large areas of future practice unaddressed—for example, how to design social AR experiences that span more than one social context or how to create mature, mixed, social AR experiences that blend elements of several independent social AR experiences. These questions demarcate part of the still undiscovered country that is the everyware experience.
Even so, UX professionals are well equipped to identify and understand the social interactions and situations that could benefit or suffer from augmentation. Successfully turning a social interaction inside out through augmentation requires discovering and addressing the nuances, thresholds, and drivers that are associated with human behavior and emotions—a normal part of the UX design process.
UX professionals have also built the requisite toolkit for designing augmented social interactions. As we augment more social settings and spaces, we can use this design toolkit to explore the relationships between overlapping AR experiences and contexts, as well as the ways AR experiences can affect ordinary reality through proximity and permeation. Because we have the tools we need to imagine and define the conceptual models for the new user experiences that augmenting the real world makes possible, User Experience has an important role to play in helping augmented reality learn to play well with others.