Cloud Computing Speaker Series: Spring 2008

Text Summarization of Review Sentiments

Eric Jensen
Summize.com

11am, March 26, 2008
A.V. Williams 2120

[Slides]

Abstract

The Web provides a unique outlet for individuals to have their voices heard on a variety of topics. Often this happens in the form of subjective opinions, with authors expressing strong sentiment either in favor of or against a product, person, issue, etc. The number of such opinions on the Web is growing at an impressive rate, by millions per week in the blogosphere alone, to say nothing of sites that focus on such content, like Amazon.com's user reviews or subjective questions on Yahoo Answers. While sentiment analysis has recently gained much attention and review mining has become an important topic of research, less emphasis has been put on efficient text summarization of opinion sentiments.

Rather, opinion mining has been narrowly defined as estimating the average polarity of sentiments expressed about various facets of the opinions' target. For products like consumer electronics with a finite set of features, this model enables relative comparison of targets within a given category. However, it is not well-suited to targets with less obvious facets (people, brands, etc.) and does not address the problem of textually summarizing the sentiments for an individual target. For example, reviews of a band's latest album might overwhelmingly say it is "not as good as their first one", or that it is "uninspired". Such statements explain the motivation behind the negative ratings for a particular target where the sentiments don't relate to obvious facets. I will describe an algorithm for textual summarization of review sentiments that scales to very large collections and detail its performance on a dataset comprising several million product reviews. In doing so, I will characterize the consensus building in this massive review pool, and discuss the potential for scaling this algorithm to all of the opinions on the Web.

About the Speaker

Eric Jensen is VP of Development at Summize.com, a sentiment mining startup based in Northern Virginia. There, he develops novel and scalable opinion mining and search for blogs, user reviews, and other user-generated content on the Web. He co-founded Summize after receiving his Ph.D. in Computer Science from the Illinois Institute of Technology Information Retrieval Lab in 2006. Under a doctoral fellowship from America Online, his work with the lab focused on repeatable evaluation of web search effectiveness, query log analysis and classification, and parallel document clustering algorithms. He is the author of over 30 articles in refereed journals, scientific conferences, workshops, and magazines, as well as a U.S. patent application.


About the Series

This page first created: 17 Jan 2008