Faceted Metadata for Information Architecture and Search
- Marti Hearst, School of Information, University of California, Berkeley
- Preston Smalley and Corey Chandler, eBay User Experience and Design
Published: June 30, 2006
The CHI 2006 program provided this course summary:
Learn the advantages of and strategies for using faceted metadata for integrating browsing and search of large information collections. Examples are drawn from formal studies and results of real-world applications.
Sometimes first impressions are a great way to gauge the likelihood of a successful experience. This wasn’t one of those times. I was deeply concerned that I’d signed myself up for some esoteric discussion on the proper use of metadata, but was pleasantly surprised to find a real-world interface solution for dealing with large information collections, which is exactly what the summary said this course would cover.
It’s worth noting that I’m a little biased in my glowing review of this course. First, I work for a company (Shopzilla) with more than 28 million products in its inventory. I was, therefore, able to make immediate use of what I learned. Second, one of our primary competitors (eBay) co-presented the course, and I enjoyed this rare opportunity to learn about their inner workings. Third, we were invited to eat lunch with the presenters between sessions, and I found them all to be likeable, sharing, and knowledgeable people.
Course highlights—independent of the subject matter:
- Marti took many questions both during and after the presentation to keep us engaged throughout the course.
- All of the presenters were well prepared to give their part of the presentation and knew the material inside out.
- The course, while originating in an academic environment, acknowledged the needs of practitioners by showing us how faceted metadata provides a solution that answers real users’ information-foraging problems and demonstrated two real-world applications of this solution: eBay Express and Endeca.
- The slides provided in the course materials contained enough information for someone who didn’t take the course to understand what was presented. This is especially important when reviewing the materials later, either for your own purposes or to share them with your coworkers.
- The course included a twenty-minute exercise, so attendees could make sense of all the different ways in which we could organize even small amounts of data.
A quick overview of the course goals:
- Introduce and explain an objective and systemized approach to creating information architectures for Web sites with large page inventories.
- Show how the approach accounts for the different ways in which users forage for information, based on their own domain knowledge.
- Demonstrate how to apply hierarchical metadata to large page inventories, to facilitate both searching and browsing.
Some key takeaways:
- Create two separate systems of classification for
  - navigating categories (taxonomy)
  - navigating, or filtering, products/documents (hierarchical metadata)
- Provide a user interface that allows users to
  - track what they have selected so far
  - easily make changes to their selections, without affecting other selections
  - move easily between categories and products/documents
- Be careful to create a user interface that isn’t likely to produce empty result sets.
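One common way to act on that last takeaway is to count, for every filter value, how many items it would return, and to offer only the values with a non-zero count. The following is a minimal sketch of that idea, not the presenters' implementation. It uses the terms facet (a descriptive field such as Brand) and label (one of its values), which the next section explains. The inventory, the brand names, and the function names `matches` and `label_counts` are all assumptions made up for illustration.

```python
from collections import Counter
from typing import Dict, List

Item = Dict[str, str]

# Hypothetical three-item inventory; brands and sizes are invented.
ITEMS: List[Item] = [
    {"Shoe Style": "Athletic", "Brand": "Brand A", "Shoe Size": "7"},
    {"Shoe Style": "Athletic", "Brand": "Brand B", "Shoe Size": "8"},
    {"Shoe Style": "Dress", "Brand": "Brand A", "Shoe Size": "8"},
]


def matches(item: Item, selected: Dict[str, str]) -> bool:
    """True if the item carries every selected facet/label pair."""
    return all(item.get(facet) == label for facet, label in selected.items())


def label_counts(items: List[Item], selected: Dict[str, str], facet: str) -> Dict[str, int]:
    """For one facet, count the items each of its labels would return.

    Selections in other facets stay applied. Any existing selection in
    `facet` itself is ignored, so the user can still switch labels.
    Labels that match no items never appear in the result.
    """
    others = {f: label for f, label in selected.items() if f != facet}
    counts: Counter = Counter()
    for item in items:
        if facet in item and matches(item, others):
            counts[item[facet]] += 1
    return dict(counts)
```

With Shoe Style set to Dress, `label_counts(ITEMS, {"Shoe Style": "Dress"}, "Brand")` returns `{"Brand A": 1}`. Brand B is simply not offered, so the user cannot click through to an empty result set.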
What Is Faceted Metadata?
Metadata is literally data that describes other data. Metadata has many applications you’re probably already familiar with. Here are some examples:
- Library card catalogs use metadata to define a book beyond its title, providing descriptive data like author, publisher, copyright date, Dewey Decimal number, and so on.
- HTML files use metadata to communicate to search engines the qualities of a Web page that aren’t mentioned specifically in its text, such as important keywords, a summary of the content on the page, the author, the tools the author used in creating the page, the formats of files included in the content, and so on.
- Comparison-shopping engines use metadata to describe, for example, the features of different cameras available in the category Digital Cameras. Some examples of these would be the number of megapixels, zoom, built-in memory, and brand.
In faceted metadata, the metadata exists in a searchable/browseable taxonomy of its own. Each of these taxonomies is called a facet. So, in the example of a library’s metadata, author and publisher are facets. Within facets, there are labels that represent the actual data in each facet. Labels within a facet called Author might include Ernest Hemingway, William Shakespeare, and Edgar Allan Poe.
In hierarchical faceted metadata, some facets are dependent upon other facets. They become available only once you have selected their parent facets.
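To make the vocabulary concrete, here is a small sketch of how facets, labels, and dependent facets might be modeled. It is only one possible representation, not something shown in the course. The facet names come from the footwear scenario that follows; the `Facet` class, its `depends_on` field, the `available_facets` function, and every label other than Athletic and Running Shoes are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Facet:
    """One facet: a name, its labels, and an optional parent dependency."""
    name: str
    labels: List[str]
    # Hypothetical field: the (parent facet, parent label) pair that must be
    # selected before this facet is offered. None means always available.
    depends_on: Optional[Tuple[str, str]] = None


# Illustrative data; "Dress", "Walking Shoes", brands, and sizes are invented.
FACETS = [
    Facet("Shoe Style", ["Athletic", "Dress"]),
    Facet("Brand", ["Brand A", "Brand B"]),
    Facet("Shoe Size", ["7", "8"]),
    Facet("Sub-style", ["Running Shoes", "Walking Shoes"],
          depends_on=("Shoe Style", "Athletic")),
]


def available_facets(facets: List[Facet], selected: Dict[str, str]) -> List[str]:
    """Return the names of the facets to show, given the current selections."""
    names = []
    for facet in facets:
        if facet.depends_on is None:
            names.append(facet.name)
        else:
            parent, label = facet.depends_on
            # A dependent facet appears only once its parent label is chosen.
            if selected.get(parent) == label:
                names.append(facet.name)
    return names
```

In this sketch, Sub-style stays hidden until the user selects Athletic under Shoe Style, which is the dependency behavior described above.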
A Usage Scenario
A complete usage scenario, courtesy of eBay and presented as part of the course, helps make these definitions concrete.
Usage Scenario: You are looking for a pair of new women’s running shoes. Using your favorite comparison-shopping engine, you browse your way to the Women’s Footwear category via this path in the category taxonomy: Apparel & Accessories > Footwear, then choose the Product Type: Women’s Footwear. Finally, you choose among the different facets of women’s shoes: Shoe Style, Brand, Shoe Size, and so on. You’re interested in athletic shoes, so under Shoe Style, you choose the label Athletic. Having chosen a label in the Shoe Style facet, you now see a dependent facet, Sub-style, in which you choose the label Running Shoes.
At this point, a good interface for hierarchical faceted metadata allows you to easily change the labels you’ve chosen, remove them altogether, add a keyword, or even start over somewhere else in the taxonomy.
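The state behind such an interface can be quite small. The sketch below tracks the chosen labels and an optional keyword, and lets each one be changed or removed without disturbing the others. It is an illustration under stated assumptions, not eBay's or Endeca's design: the `Selection` class, its method names, and the `DEPENDENTS` table are hypothetical, as are the labels Dress and Brand A.

```python
from typing import Dict, List, Optional

# Hypothetical table: parent facet -> facets that depend on it.
DEPENDENTS: Dict[str, List[str]] = {"Shoe Style": ["Sub-style"]}


class Selection:
    """Tracks what the user has selected so far."""

    def __init__(self) -> None:
        self.labels: Dict[str, str] = {}  # facet -> label, in order chosen
        self.keyword: Optional[str] = None

    def choose(self, facet: str, label: str) -> None:
        """Select or change a label; other facets are left untouched."""
        if self.labels.get(facet) != label:
            # A changed parent invalidates selections in its dependent facets.
            self._drop_dependents(facet)
        self.labels[facet] = label

    def remove(self, facet: str) -> None:
        """Remove one selection along with any selections that depend on it."""
        self._drop_dependents(facet)
        self.labels.pop(facet, None)

    def start_over(self) -> None:
        """Clear every label and the keyword."""
        self.labels.clear()
        self.keyword = None

    def summary(self) -> str:
        """Text for a 'what you have selected so far' display."""
        parts = [f"{facet}: {label}" for facet, label in self.labels.items()]
        if self.keyword:
            parts.append(f"Keyword: {self.keyword}")
        return " | ".join(parts)

    def _drop_dependents(self, facet: str) -> None:
        for child in DEPENDENTS.get(facet, []):
            self.remove(child)
```

Removing Brand leaves Shoe Style and Sub-style intact, while switching Shoe Style from Athletic to Dress also clears Sub-style, because Running Shoes no longer applies. Whether dependent selections should be cleared or preserved is a design choice; this sketch clears them.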
It’s worth noting that the terms Marti Hearst’s Flamenco research team, eBay, and Endeca have used are not the only terms that can apply to this concept. At Shopzilla, we use slightly different terms. We call our facets Attributes—as in product attributes—and our labels Attribute Values. I’ve heard other terms used in the field as well.
Neither this method of applying metadata and creating facets nor the user interface presented in the course is a new idea, but used together, they provide a research-driven, reproducible framework for organizing large collections of information, one that is applicable within many industries.
Elliott, Ame, Jennifer English, Marti Hearst, Rashmi Sinha, Kirsten Swearingen, and Ka-Ping Yee. “Finding the Flow in Web Site Search.” Communications of the ACM. Volume 45, Issue 9. September 2002. Retrieved on June 29, 2006.
Flamenco Search. “The Flamenco Search Interface Project: Search Interfaces that Flow.” University of California, Berkeley, School of Information. Retrieved on June 29, 2006.
Hearst, Marti, Kevin Lee, Kirsten Swearingen, and Ka-Ping Yee. “Faceted Metadata for Image Search and Browsing.” Paper presented at CHI 2003, Fort Lauderdale, Florida, April 2003. Retrieved on June 29, 2006.