Defining Data

Data-Informed Design

Understanding data to achieve great user experiences

December 22, 2014

Despite all the talk about data-informed design, there is not much agreement on what data really means for a product or service’s user experience. That might be because teams don’t yet have a shared language for talking about data, or because access to data is uneven or siloed, or perhaps because team members have different goals for the use of data.

At its core, data-informed design can be difficult to define, because there is not even agreement on what counts as data. We tend to think in dichotomies: quantitative and qualitative, objective and subjective, abstract and sensory, messy and curated, business and user experience, science and story. But the more I work with data and the more familiar I become with the data-science community, the more inclusive my definition of data becomes.

When UX professionals think about data, we are usually thinking about analytics, A/B testing, or at least data sets that are large enough to yield statistically significant results. Such approaches work well as ways to discover potential areas for investigation or to weed out bad ideas. However, there may be a tendency to focus on things that are easy to measure and test rather than using data to discover the big ideas that would lead to breakthroughs. A more meaningful approach would combine a high-level, big-data view with a ground-level view to deliver deep data insights.

Big Data and User Experience

Making sense of big data can seem like a Rorschach ink-blot test into which we project all our hopes and fears. Big data can be the basis of new scientific discoveries, detect terrorist networks, and let us create detailed profiles of customers so we can sell them more stuff. Well, maybe. The reality, according to the article “Hilary Mason Wants to Get You Started with Big Data,” is that big data is just “a data set that is too big to fit into your available memory, or too big to store on your own hard drive, or too big to fit into an Excel spreadsheet.”

A simple working definition that we can use in this column, and one that is relevant for user experience, is that big data is data that machines generate, tallying up what people do and say. It is how many people clicked a link, how many pageviews there were on the day a new campaign started, how many people registered. It is Web server logs, clickstream data, heatmaps, social-media activity, mobile phone calls, ebanking transactions, and information that mobile-device sensors capture in an app.
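
To make the idea of machine-generated tallies concrete, here is a minimal sketch in Python. The event records, field layout, and the `tally` helper are all hypothetical, invented for illustration; a real clickstream or server log would have its own format.

```python
from collections import Counter

# Hypothetical clickstream events: (user_id, event_type, target).
# The field layout is an assumption, not a real log format.
events = [
    ("u1", "pageview", "/home"),
    ("u1", "click", "/signup"),
    ("u2", "pageview", "/home"),
    ("u2", "pageview", "/pricing"),
    ("u3", "click", "/signup"),
    ("u3", "register", "/signup"),
    ("u1", "click", "/signup"),
]


def tally(events):
    """Return (total events per type, distinct users per type)."""
    totals = Counter(event_type for _, event_type, _ in events)
    users_by_type = {}
    for user_id, event_type, _ in events:
        users_by_type.setdefault(event_type, set()).add(user_id)
    unique_users = {t: len(ids) for t, ids in users_by_type.items()}
    return totals, unique_users


totals, unique_users = tally(events)
print("Events by type:", dict(totals))
print("Distinct users by type:", unique_users)
```

Note that "how many clicks" and "how many people clicked" are different counts: one user clicking twice adds to the first but not the second.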

Big data is aggregated behavioral or transactional data: a summary of events. It is what people did, not how they felt about it, why they did it, or even how they did it. There are three key characteristics of big data that have an impact on how we use such data to inform design.

Big data measures user behaviors and actions—for example, Web site or application analytics—as well as words—for example, social-media analytics.
Computers rather than humans collect big data.
Big data uses defined measures.

Because big data documents what has occurred without other humans getting involved in collecting it—other than in creating the data-collection system in the first place!—it feels objective. After all, the more data, the less uncertainty. For example, measuring where just ten people clicked could result in flawed data if all ten of them were distracted, but measuring where thousands of people clicked takes such variation into account to some degree.
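
The intuition that more data means less uncertainty can be illustrated with a rough margin-of-error calculation. This is a sketch under stated assumptions: the click counts are made up, `margin_of_error` is a hypothetical helper, and it uses the normal approximation for a proportion, which is itself unreliable for a sample as small as ten.

```python
import math


def margin_of_error(clicks, visitors, z=1.96):
    """Approximate 95% margin of error for a click rate.

    Uses the normal approximation for a proportion, so treat the
    result for very small samples as a rough indication only.
    """
    rate = clicks / visitors
    standard_error = math.sqrt(rate * (1 - rate) / visitors)
    return z * standard_error


# Hypothetical numbers: the same 30% click rate at two sample sizes.
small = margin_of_error(3, 10)
large = margin_of_error(1500, 5000)

print(f"10 visitors:   30% +/- {small:.1%}")
print(f"5000 visitors: 30% +/- {large:.1%}")
```

With ten visitors the estimate is uncertain by tens of percentage points; with five thousand it narrows to about one point. The calculation only addresses random variation, though. It says nothing about who ended up in the sample.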

Even so, there can be bias in big data sets. Signal bias, where the data set represents only a certain subset of people, is a common flaw. For example, while a well-known study combining Hurricane Sandy-related Twitter and Foursquare data produced some interesting insights, it also gave the impression that Manhattan was the hub of the disaster. In user experience, a simple case of signal bias might be having more extensive analytics for desktop computers than for mobile devices.
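
One simple way to check for the desktop-versus-mobile case is to compare each device's share of tracked sessions with its share of the actual audience. The sketch below assumes you have some independent estimate of the real audience mix; the numbers and the `coverage_ratio` helper are hypothetical.

```python
def coverage_ratio(tracked_sessions, actual_share):
    """Compare each segment's share of tracked data to its real share.

    A ratio above 1 means the segment is over-represented in the
    data set; below 1 means it is under-represented.
    """
    total = sum(tracked_sessions.values())
    return {
        segment: (count / total) / actual_share[segment]
        for segment, count in tracked_sessions.items()
    }


# Hypothetical figures: analytics capture mostly desktop sessions,
# while the real audience is assumed to be split evenly.
tracked_sessions = {"desktop": 8000, "mobile": 2000}
actual_share = {"desktop": 0.5, "mobile": 0.5}  # assumption

ratios = coverage_ratio(tracked_sessions, actual_share)
for segment, ratio in ratios.items():
    print(f"{segment}: {ratio:.1f}x its real share")
```

In this made-up case, desktop sessions carry 1.6 times their real weight and mobile sessions only 0.4 times, so any overall average drawn from the data would lean toward desktop behavior.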

Big data can be multistructured, such as Web log data that includes text and images alongside structured transactional information. It can also be unstructured, text-heavy data from metadata and social-media posts. This type of data lends itself to exploration, but can also lead to correlations that prove false. The recent failure of Google Flu Trends, which resulted from Google engineers not knowing what linked certain search terms with the spread of the flu, illustrates a correlation-causation gap. (They have since adjusted the algorithm to include CDC data.) An example from user experience would be seeing that people who searched for a certain keyword tended to bounce, then using that correlation alone to make inferences about user interface problems.

Big data can be a good starting point for learning about a product or service user experience, but it often generates as many questions as answers. Data is not just numbers; it represents the actions and words of real people with complicated lives. But it lacks context. If big data is the archeology of user experience—or the study of the traces that people leave behind—small data is more like anthropology, exploring people’s lives as they are living them online.

Thick Data and User Experience

Anyone working in user experience has data from studies of various kinds, but no one seems to be clear about what to call it—or whether we should even call it data at all. Small data has become a way to talk about the last mile of big data, or bringing big data insights to the user, but it may not be the best way to characterize study data. Thick data—or sometimes deep data—refers to ethnographic data, which is typically extensive, coded, descriptive data and is closer to what we mean when we talk about the results of UX studies.

Whatever we call this data, it is the kind of data that UX teams intentionally collect in remote or lab usability tests, ethnographies, online studies, and even surveys. Thick data often answers questions that derive from clues that we glean from big data—such as why people are not converting or how people use an app as part of their customer journey. It is about understanding users’ behaviors and feelings, and taking a closer look at how they’re using a site or app.

Analysis of thick data tends to be less systematic and more intuitive. While traditional market research transcribes, then codes people’s responses, UX research tends to bypass that step because of the need for tight turnarounds and multiple iterations. Because thick data usually emerges from custom studies, it’s not generally measured, tracked, or benchmarked. The lifespan and scope of thick data are limited, which can make it difficult to see the big picture.

Why Not Quantitative and Qualitative?

The quantitative / qualitative distinction is not clear cut, so I try to steer away from it. Quantitative is anything with a count, and that includes data from tools like Google Analytics, in-house studies such as an intercept survey, and even third-party benchmarks such as a Forrester CX score or Net Promoter Score (NPS). It’s a mix of both data that machines generate, tallying what people do and say, and data from studies, which we create and analyze differently. In the end, we want all the data together, so the differences may not matter that much.
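
Net Promoter Score is a good example of how a count can come out of what is essentially an opinion question. As a sketch, using the standard definition (ratings of 9 or 10 are promoters, 0 through 6 are detractors) and a made-up set of survey responses:

```python
def net_promoter_score(ratings):
    """Compute NPS from 0-10 'how likely are you to recommend' ratings.

    NPS = percentage of promoters (9-10) minus percentage of
    detractors (0-6). Passives (7-8) count only in the total.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical survey responses, for illustration only.
ratings = [10, 9, 9, 8, 7, 6, 5, 10, 3, 9]
score = net_promoter_score(ratings)
print(f"NPS: {score:+.0f}")
```

The score is a number, but it summarizes how people say they feel, which is part of why the quantitative / qualitative line is hard to draw.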

All the Data!

We know that data insights are usually a product of combining methods—for example, pairing ethnography with analytics or usability testing with social-media sentiment analysis. The problem is that there is no easy way to bring all our data together.

The future of data-informed UX design will likely be data that has a layer of intelligence. Currently, there are just a few ways in which organizations are using machine and human intelligence to connect, interpret, and visualize all of their data.

algorithms—Big-data analysis starts with counts and moves to correlations. But big data in combination with new algorithms can create more meaningful data. This is the stuff of the new data science. It is a combination of data mining, or looking for patterns in historical data, and predictive analysis, or creating a model of probabilities. We are very familiar with this kind of algorithm. Examples include Amazon recommendations and Google search results.
frameworks—We can bake analysis into the data by structuring or superimposing a structure on big or small data, creating a framework. In a simple framework, we might map data points to a category—for example, mapping time exploring and happiness to an engagement metric. By mapping data points to a larger framework—for example, Google’s HEART or Avinash Kaushik’s See-Think-Do—we can solve the problem of data in organizational silos and synthesize all of our data in a way that can inform UX design.
storytelling—Obviously, human intelligence continues to be a crucial component of making meaning out of any data. Storytelling is at the core of data science. Creating a visual or textual narrative adds a layer of intelligence that can be compelling and actionable. Data visualization tools help tell a story about big data or quantified thick data. UX teams tell stories by combining big and small data, using personas, and creating customer journey maps.
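
The framework idea can be sketched as a simple mapping from individual data points to shared categories. The example below uses the five HEART categories (Happiness, Engagement, Adoption, Retention, Task success); the metric names, sources, and category assignments are hypothetical choices for illustration, not a prescribed mapping.

```python
from collections import defaultdict

HEART = ("Happiness", "Engagement", "Adoption", "Retention", "Task success")

# Hypothetical data points from different tools and studies.
# Which category each metric belongs to is a team judgment call.
data_points = [
    {"metric": "satisfaction rating", "source": "survey", "category": "Happiness"},
    {"metric": "sessions per week", "source": "analytics", "category": "Engagement"},
    {"metric": "time exploring", "source": "analytics", "category": "Engagement"},
    {"metric": "new registrations", "source": "analytics", "category": "Adoption"},
    {"metric": "users returning after 30 days", "source": "analytics", "category": "Retention"},
    {"metric": "checkout completion", "source": "usability test", "category": "Task success"},
]


def group_by_category(points):
    """Group (metric, source) pairs under each framework category."""
    grouped = defaultdict(list)
    for point in points:
        if point["category"] not in HEART:
            raise ValueError(f"unknown category: {point['category']}")
        grouped[point["category"]].append((point["metric"], point["source"]))
    # Return every category, in framework order, even if it is empty.
    return {category: grouped.get(category, []) for category in HEART}


framework = group_by_category(data_points)
for category, metrics in framework.items():
    print(category, "->", metrics)
```

Once survey results, analytics, and usability findings sit under the same headings, an empty or thin category shows where a team has no data at all.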

Big data can certainly inform business decisions—and by extension, design decisions. But in the UX world, as in other fields that study the complex behaviors of people or animals, small data is the norm. The key to creating a better user experience is bringing together all of our data to create actionable insights.

Pamela Pavliscak
