Workshop Review: HCIR 2010

September 20, 2010

In 2007, I had the pleasure of organizing HCIR 2007, the first Workshop on Human-Computer Interaction and Information Retrieval at MIT. The rallying cry of the workshop was that, while the fields of Human-Computer Interaction (HCI) and Information Retrieval (IR) had both developed innovative techniques to address the challenges of information access, their insights had often failed to cross disciplinary borders.

Subsequent workshops built on the success of the first. Microsoft Research hosted HCIR 2008, and their Susan Dumais, a coinventor of latent semantic indexing who later won the Gerard Salton Award (the Nobel Prize of information retrieval), was a keynote speaker. HCIR 2009 moved the workshop back east to Catholic University and enlisted HCI pioneer Ben Shneiderman as a keynote.

HCIR 2010 showed how much this fledgling workshop has grown up. It attracted sponsorship from Google, Microsoft Research, Endeca, and the Linguistic Data Consortium. Google also supplied a rousing keynote speaker: user experience researcher Dan Russell. But most impressive was the quantity and quality of people who actively participated in this day-long event.

The day began with a boaster session, during which all contributors had one minute to pitch their work. Some of these were particularly creative. Cathal Hoare, a PhD student at University College Cork, in Ireland, delivered his boaster as a poem. Sarah Gilbert of Dalhousie University, in Canada, used a Star Wars–themed odyssey to illustrate the challenges that Joe Student faces in writing a research paper. Vladimir Zelevinsky from Endeca let his animated slides speak for themselves—he remained silent for the entire minute. Despite the large number of boasters, the session was engaging and the time went by quickly.

Then came Dan Russell’s keynote, “Why is search sometimes easy and sometimes hard?” Dan began by celebrating the successes of our current information retrieval systems. But he quickly dropped the other shoe: today’s systems still fail on a wide variety of tasks. Moreover, text-oriented search engines lag behind our input devices, which now commonly have cameras and microphones. But the bigger problem is that, while search engine developers have worked hard on improving the quality of data and algorithms, too little work has gone into improving the knowledge of users.

For example, 90% of users do not know they can find text in the content on a Web page using the keyboard shortcut Ctrl+F on a PC, Command-F on a Mac, or the equivalent command on the Edit menu. An experiment showed that teaching users to find text could increase their search effectiveness—as measured by the time it takes them to get to the desired result—by 12%, as Figure 1 shows. We are thrilled when an algorithmic improvement achieves a small fraction of that! In general, user expertise lags while search interfaces become richer and more powerful. While it was sobering to see how much users struggle with our current tools, it was encouraging to realize how much opportunity there is to improve user experience and effectiveness through support and education.

Figure 1—The power of teaching users to find text on a Web page

During the systems session, Klaus Berberich described a system that supports time-based exploration of news archives, developed by researchers at Microsoft and the Max Planck Institute for Informatics. It uses crowdsourcing to annotate timelines with major events. John Stasko spoke about an interactive visualization system that researchers at Georgia Tech have developed to help users analyze relationships among entities. Both systems emphasize recall-intensive scenarios that we sometimes forget when focusing on precision-oriented Web search tasks.

After lunch came the poster/demo session. The posters spanned a broad range of topics, from theoretical models of information propagation to analysis of user behavior on faceted search sites. Meanwhile, everyone developing information-retrieval systems offered demos—especially the HCIR Challenge participants. As in previous years, this session was the one attendees found most engaging, and it was difficult to pull people away from their discussions and back to their seats.

The following session focused on models and evaluation. Mark Smucker told us about experiments at the University of Waterloo, where they’re analyzing how retrieval precision affects users’ perceived task difficulty. Pertti Vakkari from the University of Tampere talked about the evaluation of exploratory search. He advocated using evaluation criteria that move beyond focusing on a search system’s output and instead focus on the various stages of the information-seeking process, such as allowing users to elaborate on their information needs.

But my favorite talk in this session was from Max Wilson of Swansea University. His presentation described joint work with David Elsweiler of the University of Erlangen to study casual search, which occurs in the absence of an information-seeking goal. While casual-search activities resemble those of exploratory search, casual search defies researchers’ assumptions that users’ unfamiliarity with a domain, system, or information need drives their exploration. Rather, a user’s goal is experiential: to pass the time enjoyably. Much of the activity on social networks fits this description, and our interaction models have to account for this.

Figure 2—Examples of casual search

The final session was the highly anticipated HCIR Challenge. Thanks to the generosity of the Linguistic Data Consortium, participants had access to the New York Times Annotated Corpus, a collection of almost two million articles published between 1987 and 2007. They competed to build systems that best embodied the spirit of HCIR: providing users with the flexibility, control, and guidance to effectively, efficiently, and enjoyably perform complex information-seeking tasks. Example task scenarios included learning about a news topic with a long history and understanding the competing perspectives on a controversial issue. Six teams built systems, and the audience voted to choose a winner.

The competitors offered a variety of innovative approaches. Corrado Boscarino, representing a Dutch team from Centrum Wiskunde & Informatica, presented a visual user interface that helps users identify authorities and witnesses for events. Christian Kohlschütter from Leibniz Universität in Hannover demoed NewsClub, which identifies relevant terms for sets of documents and visualizes their network of associations. Raman Chandrasekar from Microsoft Research showed us News Sync, a summarization system that helps users catch up on news from a particular time period or location or about a specific topic. Vladimir Zelevinsky from Endeca demoed a dynamic approach to faceted search that lets users create custom facets. Wei Zheng of the University of Delaware presented a system that combined topic modeling and sentiment analysis to identify and pair up competing perspectives on an issue.

But the winner was Time Explorer, a system Michael Matthews and his colleagues developed at Yahoo! Labs Barcelona. Time Explorer brings an exploratory perspective to the temporal dimension, letting users see the evolution of a topic over time. It goes beyond publication dates, inferring content date by parsing absolute and relative dates and time spans—even in the future—from the article text. The slick visualization allows users to discover unexpected relationships between entities at particular points in time—for example, between Slobodan Milosevic and Saddam Hussein.
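To make the idea of inferring content dates concrete, here is a minimal Python sketch of resolving absolute and relative date expressions against an article's publication date. It is not Time Explorer's implementation, which this review does not describe. The function name `infer_content_dates`, the tiny phrase vocabulary, and the sample sentence are all assumptions made for illustration; a real system would need far richer temporal parsing.

```python
import re
from datetime import date, timedelta

# Hypothetical, deliberately tiny vocabulary of relative expressions.
RELATIVE_DAYS = {"yesterday": -1, "today": 0, "tomorrow": 1}
RELATIVE_YEARS = {"last year": -1, "this year": 0, "next year": 1}

# Absolute four-digit years from 1900 through 2099.
ABSOLUTE_YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def infer_content_dates(text, published):
    """Return (expression, resolved) pairs found in the text.

    Day-level expressions resolve to a date relative to the
    publication date; year-level expressions resolve to an int year.
    """
    lowered = text.lower()
    found = []
    for phrase, offset in RELATIVE_DAYS.items():
        if re.search(r"\b" + phrase + r"\b", lowered):
            found.append((phrase, published + timedelta(days=offset)))
    for phrase, offset in RELATIVE_YEARS.items():
        if phrase in lowered:
            found.append((phrase, published.year + offset))
    for match in ABSOLUTE_YEAR.finditer(text):
        found.append((match.group(), int(match.group())))
    return found


if __name__ == "__main__":
    # Invented sample sentence and publication date, for illustration only.
    sample = "The treaty, signed in 1995, expires next year; talks resumed yesterday."
    for expression, resolved in infer_content_dates(sample, date(2001, 3, 15)):
        print(expression, "->", resolved)
```

The point of the sketch is that "next year" in an article published in 2001 refers to 2002, a date after the publication date, which is how a system can place articles on a timeline by what they discuss rather than only by when they appeared.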

In summary, HCIR 2010 offered an exciting day, packed with topics at the nexus of information retrieval and human-computer interaction. With major Web search engines increasingly taking an interest in user interactions, it is HCIR that will shape the future of our search user interfaces. 

Tech Lead at Google

New York, New York, USA

Daniel Tunkelang

Daniel is a passionate advocate for human-computer information retrieval, an approach that aims to better harness human intelligence in the information-seeking process. Before joining Google, he was a founding employee and Chief Scientist at Endeca, where he developed a faceted search system that has over 250 million users around the world. He designed TunkRank, a robust method for measuring Twitter authority. Daniel has often served as a bridge between academic researchers and industry practitioners. He founded the annual workshops on Human-Computer Interaction and Information Retrieval (HCIR) and has chaired the industry track at ACM SIGIR, the world’s leading conference on information retrieval. Daniel’s book Faceted Search, the first book on faceted search, was published as part of the Morgan & Claypool lecture series. Daniel received a PhD in computer science from Carnegie Mellon University for his work on information visualization.
